LARGE DEVIATIONS FOR WEIGHTED EMPIRICAL MEAN WITH OUTLIERS

Abstract. We study in this article the large deviations for the weighted empirical mean $L_n = \frac{1}{n}\sum_{i=1}^n \mathbf{f}(x_i^n)\cdot Z_i$, where $(Z_i)_{i\in\mathbb{N}}$ is a sequence of $\mathbb{R}^d$-valued independent and identically distributed random variables with some exponential moments and where the deterministic weights $\mathbf{f}(x_i^n)$ are $m\times d$ matrices. Here $\mathbf{f}$ is a continuous application defined on a locally compact metric space $(\mathcal{X},\rho)$ and we assume that the empirical measure $\frac{1}{n}\sum_{i=1}^n \delta_{x_i^n}$ weakly converges to some probability distribution $R$ with compact support $\mathcal{Y}$.

The scope of this paper is to study the effect on the Large Deviation Principle (LDP) of outliers, that is elements $x^n_{i(n)} \in \{x_i^n,\ 1\le i\le n\}$ such that

$$\liminf_{n\to\infty} \rho\big(x^n_{i(n)},\mathcal{Y}\big) > 0.$$

We show that outliers can have a dramatic impact on the rate function driving the LDP for $L_n$. We also show that the statement of an LDP in this case requires specific assumptions related to the large deviations of the single random variable $Z_1/n$. This is the main input with respect to a previous work by Najim [11].
1. Introduction

The model. We study in this article a Large Deviation Principle (LDP) for the weighted empirical mean

$$L_n = \frac{1}{n}\sum_{i=1}^n \mathbf{f}(x_i^n)\cdot Z_i,$$

where $(Z_i)_{i\in\mathbb{N}}$ is a sequence of $\mathbb{R}^d$-valued independent and identically distributed (i.i.d.) random variables satisfying:

$$\mathbb{E}\,e^{\alpha|Z_1|} < \infty \quad\text{for some } \alpha > 0. \tag{1.1}$$

The application $\mathbf{f}:\mathcal{X}\to\mathbb{R}^{m\times d}$ is an $m\times d$ matrix-valued continuous function, $(\mathcal{X},\rho)$ being a locally compact metric space. The term $\mathbf{f}(x)\cdot Z$ denotes the product between matrix $\mathbf{f}(x)$ and vector $Z$. The set $\{x_i^n,\ 1\le i\le n,\ n\ge 1\}$ is an $\mathcal{X}$-valued sequence of deterministic elements such that the empirical measure $R_n = \frac{1}{n}\sum_{i=1}^n \delta_{x_i^n}$ satisfies:

$$R_n \xrightarrow[n\to\infty]{\text{weakly}} R, \tag{1.2}$$
FIGURE 1. The rate function of $\frac{1}{n}\sum_{i=1}^n X_i^2$ where the $X_i$'s are $\mathcal{N}(0,1)$ Gaussian i.i.d. random variables (left); the rate function of the same empirical mean perturbed by an outlier term (right). Both rate functions coincide below a threshold value of $x$ but the right one is linear above it.



where $R$ is a probability measure with compact support $\mathcal{Y}$.

We focus in this paper on cases where there are outliers, that is where some of the $x_i^n$ remain far from the support (also called bulk) of $R$. Loosely speaking, one can think of an outlier as a sequence $(x^n_{i(n)},\ n\ge 1)$ satisfying:

$$\liminf_{n\to\infty} \rho\big(x^n_{i(n)},\mathcal{Y}\big) > 0. \tag{1.3}$$

At a large deviation level, such outliers may have a dramatic impact on the shape of the rate function as demonstrated in the simple example of Figure 1. Although the model under study looks very similar to the LDP studied in [11], the presence of outliers substantially modifies the resulting LDP and may naturally create infinitely many non-exposed points (see the definition in [7] and also Remarks 3.3 and 4.2) for the rate function.

The purpose of this article is to provide clear assumptions (which cover situations where (1.3) can occur) over the set $\{\mathbf{f}(x_i^n),\ 1\le i\le n,\ n\ge 1\}$ and over $Z_i$ under which fairly general LDP results can be proved.
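The phenomenon described for Figure 1 can be sketched numerically. The following is a minimal illustration under assumptions that are ours and not the paper's: a one-dimensional toy model in which all bulk weights equal 1, $Z_i = X_i^2$ with $X_i$ standard Gaussian, and a single outlier weight `c > 1` (a hypothetical parameter). In that setting the log-Laplace transform is $-\frac12\log(1-2\lambda)$ and the outlier restricts $\lambda$ to $(-\infty, 1/(2c))$, so the rate function becomes linear above $c/(c-1)$. The function names are illustrative.

```python
import math

def rate_no_outlier(x):
    """Cramer rate function of the mean of squared N(0,1) variables."""
    if x <= 0:
        return math.inf
    return 0.5 * (x - 1.0 - math.log(x))

def rate_with_outlier(x, c):
    """sup over lambda < 1/(2c) of lambda*x + 0.5*log(1 - 2*lambda), for c > 1.

    Closed form: the unconstrained maximiser (1 - 1/x)/2 is admissible only
    when x < c/(c-1); beyond that the sup sits at the boundary 1/(2c).
    """
    if x <= 0:
        return math.inf
    threshold = c / (c - 1.0)
    if x <= threshold:
        return rate_no_outlier(x)
    lam_max = 1.0 / (2.0 * c)
    return lam_max * x + 0.5 * math.log(1.0 - 2.0 * lam_max)

def rate_by_grid(x, c, lam_min=-5.0, steps=200000):
    """Brute-force version of the same constrained sup (sanity check only).

    The grid includes the end-point 1/(2c); by continuity this does not
    change the value of the sup over the open half-line.
    """
    lam_max = 1.0 / (2.0 * c)
    best = -math.inf
    for k in range(steps + 1):
        lam = lam_min + (lam_max - lam_min) * k / steps
        best = max(best, lam * x + 0.5 * math.log(1.0 - 2.0 * lam))
    return best
```

With `c = 3.0`, for instance, the two rate functions agree up to `x = 1.5` and the one with outlier grows with constant slope `1/6` afterwards.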

Motivations and related work. Such models are of particular interest in the field of statistical mechanics (spherical spin glasses in [1], spherical integrals in the finite rank case in [9], etc.) where one often has to establish an LDP for the empirical mean $L_n$ in the case where the random variable $Z_i$ satisfies condition (1.1). In particular, spherical integrals are intimately connected to the study of Deformed Ensembles (see [12] for instance for the definition) in Random Matrix Theory. In dimension one, $Z_i$ is typically the square of a Gaussian random variable. The measure $\frac{1}{n}\sum_{i=1}^n \delta_{x_i^n}$ is then a realization of the empirical measure of the eigenvalues associated to a given random matrix model and there are important cases when some of the $x_i^n$'s stay far away from the support of $R$. Indeed, there has recently been a strong interest in random matrix models (so-called spiked models) where some of the largest eigenvalues lie out of the bulk, that is where the set of limit points of $(x_i^n,\ 1\le i\le n,\ n\ge 1)$ can differ from the support of $R$ (see Johnstone [10], Baik et al. [2], [3], Péché [12]). These spiked models are of particular interest for statistical applications [10].

The study of the LDP for weighted means was developed by Bercu et al. [5] for Gaussian functionals and considered in greater generality in Najim [11]. In [11], the LDP is stated for $L_n$ under condition (1.1) but in the case where $(x_i^n,\ 1\le i\le n,\ n\ge 1)$ is a subset of $\mathcal{Y}$, the support of the limiting probability measure $R$. In particular, the framework of [11] does not allow any of the $x_i^n$'s to lie far from the bulk. LDPs involving outliers can be found in Bercu et al. [5], Guionnet and Maïda [9]. For related work concerning quadratic forms of Gaussian processes, we also refer the reader to Bercu et al. [1], Gamboa et al. [8], Bryc and Dembo [6] and Zani [15].

Presentation of the results. The purpose of this article is to establish the LDP for the empirical mean $L_n$ under the moment assumption (1.1) and under assumptions which allow the presence of outliers (see (1.3)). Such an LDP will rely on the individual LDP for $Z_1/n$. This is the content of the following assumption.

Assumption A-1. The $\mathbb{R}^d$-valued random variable $Z_1$ satisfies the following exponential condition:

$$\mathbb{E}\,e^{\alpha|Z_1|} < \infty \quad\text{for some } \alpha > 0,$$

and $Z_1/n$ satisfies the LDP with a good rate function denoted by $I$.

Note that if $Z_1/n$ does not satisfy an LDP, one can construct counterexamples where $L_n$ does not fulfill an LDP (see for instance [11, Section 2.3]). Finally, two subcases of Assumption (A-1) yield two distinct classes of results:

The case where $I$ is convex (Assumption (A-2), Section 2.3). This paper is mainly devoted to the study of this case. If $I$ is convex then the assumptions on the sets $\{\mathbf{f}(x_i^n),\ 1\le i\le n\}$, $n\ge 1$, needed to state the LDP for $L_n$ are quite mild. Apart from a standard compactness assumption (Assumption (A-3), see Section 2.3), the main assumption over these sets (Assumption (A-4), Section 2.3) bears solely on their limiting points (in the sense of Painlevé-Kuratowski convergence of sets) and on their role in the LDP. It turns out that (A-4) is an intricate assumption concerning the limiting behaviour of these sets and some of their limiting points involved in the definition of a certain convex domain. This convex domain plays a role in the definition of the rate function of the LDP. As demonstrated by examples in Section 2.2, (A-4) covers a wide variety of models with outliers in the convex case, at least those for which an LDP is to be expected.

Under Assumptions (A-1)-(A-4) and the more classical assumption (A-5) (convergence of $R_n$ to $R$), the empirical mean $L_n$ satisfies the LDP with a good convex rate function (Theorem 3.2). This rate function admits a fairly good representation (in terms of convex features) where the role of the outliers is quite transparent (Theorem 3.6 and examples in Section 4).

The case where $I$ is not convex. In this case, one can still prove the LDP but the assumptions over the sets $\{\mathbf{f}(x_i^n),\ 1\le i\le n\}$ are much more stringent and the rate function is given by an abstract formula. Moreover, very little insight can be gained from the study of the general formula of the rate function. It seems that the study must be carried out on a case-by-case basis.

Outline of the article. In order to study the Large Deviations of $L_n$, we shall separate outliers from the bulk and split accordingly $L_n$ into two subsums:

$$L_n = \frac{1}{n}\sum_{\{x_i^n \text{ far from the bulk}\}} \mathbf{f}(x_i^n)\cdot Z_i + \frac{1}{n}\sum_{\{x_i^n \text{ near or in the bulk}\}} \mathbf{f}(x_i^n)\cdot Z_i \ \triangleq\ \pi_n + \tilde{L}_n.$$

The idea is then to establish separately the LDP for each subsum. This line of proof has been developed in the one-dimensional setting for Gaussian quadratic forms by Bercu et al. [5] and is extended to the multidimensional setting in this article.

The paper is organized as follows. Sections 2, 3 and 4 are devoted to the study of the convex case.

In Section 2, we study the Large Deviations for the following model:

$$\pi_n = \frac{1}{n}\sum_{x_i^n\in C_n} \mathbf{f}(x_i^n)\cdot Z_i \quad\text{where}\quad \frac{\mathrm{card}(C_n)}{n}\xrightarrow[n\to\infty]{} 0. \tag{1.4}$$

The main assumptions related to the set $C_n^{\mathbf{f}} = \{\mathbf{f}(x_i^n);\ x_i^n\in C_n\}$ are stated and the LDP for $\pi_n$ is established.

In Section 3, the decomposition $L_n = \pi_n + \tilde{L}_n$ where $\pi_n$ satisfies (1.4) is precisely specified, the LDP for $L_n$ is established and a representation formula is given for the rate function. Section 4 is devoted to examples of LDPs with outliers in the convex case.

A general LDP stated with an abstract rate function is established in the non-convex case in Section 5. In Section 6, a partial study of the rate function is also carried out in the non-convex case in the setting of a specific example.

Comments related to the link between the study of the spherical integral and the LDP of $L_n$ are made in Sections 4 (rank one case) and 6 (higher rank).

2. The LDP for the partial mean $\pi_n$ in the convex case

For every $n\ge 1$, let $C_n$ be a finite subset of $\mathcal{X}$. This section is devoted to the study of the LDP of

$$\pi_n = \frac{1}{n}\sum_{x_i^n\in C_n} \mathbf{f}(x_i^n)\cdot Z_i \quad\text{where}\quad \frac{\mathrm{card}(C_n)}{n}\xrightarrow[n\to\infty]{} 0,$$

with $\mathrm{card}(C_n)$ standing for the cardinality of the set $C_n$. It will be proved in Section 3.1 that $L_n$ can be decomposed as $\pi_n + \tilde{L}_n$ with $\pi_n$ as above.

Remark 2.1. In the case where the random variable $Z_1$ satisfies

$$\mathbb{E}\,e^{\alpha|Z_1|} < \infty \quad\text{for all } \alpha\in\mathbb{R}^+, \tag{2.1}$$

the following limit holds true:

$$\limsup_{n\to\infty}\frac{1}{n}\log\mathbb{P}\{|\pi_n| > \delta\} = -\infty \quad\text{for all } \delta > 0.$$

Otherwise stated, $L_n$ and $\tilde{L}_n$ are exponentially equivalent and $\pi_n$ does not play any role at a large deviation level. Of course the situation is completely different if (2.1) does not hold.

We first introduce some notations as well as the concepts of inner limit, outer limit and Painlevé-Kuratowski convergence for sets. We then state the assumptions over the sets $C_n^{\mathbf{f}} = \{\mathbf{f}(x_i^n),\ x_i^n\in C_n\}$ and prove the LDP for $\pi_n$.

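2.1. Notations. Denote by $\mathcal{B}(\mathcal{Z})$ the Borel sigma-field of a given topological space $\mathcal{Z}$ (usually $\mathbb{R}^d$, $\mathbb{R}^m$, $\mathbb{R}^{m\times d}$ or $\mathcal{X}$). Denote by $|\cdot|$ a norm on any finite-dimensional vector space ($\mathbb{R}^d$, $\mathbb{R}^m$ or $\mathbb{R}^{m\times d}$). In the sequel, we use bold letters $\mathbf{a}, \mathbf{b}, \mathbf{y}$, etc. to denote $m\times d$ matrices. We denote by $\langle\cdot,\cdot\rangle$ the scalar product in any finite-dimensional space and by $\cdot$ the product between vectors and matrices with compatible size. Let $A$ be a subset of $\mathbb{R}^k$. We denote by $\bar{A}$ its closure, by $\mathrm{int}(A)$ its interior, by $\Delta(\cdot\mid A)$ the convex indicator function of the set $A$ and by $\Delta^*(\cdot\mid A)$ its convex conjugate (also called the support function of $A$), that is:

$$\Delta(\theta\mid A) = \begin{cases} 0 & \text{if } \theta\in A,\\ \infty & \text{else,}\end{cases}
\qquad
\Delta^*(y\mid A) = \sup_{\theta\in\mathbb{R}^k}\{\langle y,\theta\rangle - \Delta(\theta\mid A)\} = \sup_{\theta\in A}\langle y,\theta\rangle,$$

where $y$ and $\theta$ are in $\mathbb{R}^k$. The following proposition, whose proof is straightforward, will be of constant use in the sequel.

Proposition 2.1. Let $A$ be a subset of $\mathbb{R}^k$, then

$$\Delta^*(\cdot\mid A) = \Delta^*(\cdot\mid \bar{A}).$$

If moreover $A$ is convex with non-empty interior, then

$$\Delta^*(\cdot\mid \mathrm{int}(A)) = \Delta^*(\cdot\mid A) = \Delta^*(\cdot\mid \bar{A}).$$

Let $(D_n)$ be a sequence of subsets of $\mathbb{R}^{m\times d}$. We define its outer limit (denoted by $D_{\infty,\mathrm{out}}$) and its inner limit (denoted by $D_{\infty,\mathrm{in}}$) by

$$D_{\infty,\mathrm{out}} = \Big\{\mathbf{x}\in\mathbb{R}^{m\times d},\ \exists\,\phi:\mathbb{N}\to\mathbb{N} \text{ increasing},\ \exists\,\mathbf{x}_{\phi(n)}\in D_{\phi(n)},\ \mathbf{x}_{\phi(n)}\xrightarrow[n\to\infty]{}\mathbf{x}\Big\},$$

$$D_{\infty,\mathrm{in}} = \Big\{\mathbf{x}\in\mathbb{R}^{m\times d},\ \exists\, n_0,\ \forall n\ge n_0,\ \exists\,\mathbf{x}_n\in D_n,\ \mathbf{x}_n\xrightarrow[n\to\infty]{}\mathbf{x}\Big\}.$$

The limit of the sets $(D_n)$ exists if the outer limit and the inner limit are equal. Set convergence in this sense is known as Painlevé-Kuratowski convergence and in this case, we will denote:

$$D_n \xrightarrow[n\to\infty]{pk} D_\infty.$$

For more details on Painlevé-Kuratowski convergence of sets, see Rockafellar and Wets [14, Chapter 4].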
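To make the inner and outer limits concrete, here is a small sketch restricted to a very special situation, which is our assumption and not a general algorithm: the sequence of sets is periodic, $D_n = $ `period_sets[n % p]`, and each set is finite. In that case a convergent sequence of points taken in the $D_n$ is eventually constant, so the outer limit is the union of the sets of one period and the inner limit is their intersection. The function names are hypothetical.

```python
def outer_limit(period_sets):
    """Outer limit of a periodic sequence of finite sets: union over one period."""
    result = set()
    for s in period_sets:
        result |= set(s)
    return result

def inner_limit(period_sets):
    """Inner limit of a periodic sequence of finite sets: intersection over one period."""
    sets = [set(s) for s in period_sets]
    result = set(sets[0])
    for s in sets[1:]:
        result &= s
    return result

def converges_pk(period_sets):
    """Painleve-Kuratowski convergence holds iff both limits coincide."""
    return outer_limit(period_sets) == inner_limit(period_sets)
```

For the two examples of Section 2.2, the sets alternate between `{3}` and `{1}` (Example 1) and between `{3, 4}` and `{1, 4}` (Example 2); neither sequence converges in the Painlevé-Kuratowski sense, but only the second has a non-empty inner limit.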



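2.2. A preliminary analysis: Two simple examples. Consider

$$C_n^{\mathbf{f}} = \{\mathbf{f}(x_i^n),\ x_i^n\in C_n\} \quad\text{where}\quad \frac{\mathrm{card}(C_n)}{n}\xrightarrow[n\to\infty]{} 0.$$

The sets $C^{\mathbf{f}}_{\infty,\mathrm{in}}$ and $C^{\mathbf{f}}_{\infty,\mathrm{out}}$ are respectively the inner and outer limits of $(C_n^{\mathbf{f}})$. In the study of the forthcoming examples, we will focus on the links between the LDP for $\pi_n$ and the sets $C^{\mathbf{f}}_{\infty,\mathrm{in}}$ and $C^{\mathbf{f}}_{\infty,\mathrm{out}}$. This section is aimed at introducing Assumption (A-4) but can be skipped as no further notation is introduced.

2.2.1. Example 1: A simple case where the LDP fails to hold for $\pi_n$. Let $X$ be a standard Gaussian random variable and consider $\pi_n = \frac{2+(-1)^n}{n}X^2$. Direct computations yield the LDP for $\pi_{2n}$ (resp. $\pi_{2n+1}$) with good rate function $\Lambda^*_{\mathrm{even}}$ (resp. $\Lambda^*_{\mathrm{odd}}$) where

$$\Lambda^*_{\mathrm{even}}(z) = \begin{cases} z/6 & \text{if } z\ge 0,\\ \infty & \text{else,}\end{cases}
\qquad\text{and}\qquad
\Lambda^*_{\mathrm{odd}}(z) = \begin{cases} z/2 & \text{if } z\ge 0,\\ \infty & \text{else.}\end{cases}$$

Therefore one cannot expect the LDP for $(\pi_n,\ n\in\mathbb{N})$.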

2.2.2. Example 2: The LDP holds after modification of Example 1. Let $X$ and $Y$ be independent standard Gaussian random variables and consider $\pi_n = \frac{2+(-1)^n}{n}X^2 + \frac{4}{n}Y^2$. In this case, $\pi_{2n}$ and $\pi_{2n+1}$ satisfy the LDP (by a direct analysis) with the same rate function

$$\Lambda^*(z) = \begin{cases} z/8 & \text{if } z\ge 0,\\ \infty & \text{else.}\end{cases}$$

This yields the LDP for the whole sequence $(\pi_n,\ n\in\mathbb{N})$ with rate function $\Lambda^*$.

Despite the erratic behaviour of $\frac{2+(-1)^n}{n}X^2$ (as seen in the previous example), the LDP holds due to the presence of the term $\frac{4}{n}Y^2$.

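2.2.3. Comparison of the two examples. Denote by

$$\mathcal{D}_y = \{\lambda\in\mathbb{R},\ \log\mathbb{E}\,e^{\lambda y X^2} < \infty\} = \big(-\infty, (2y)^{-1}\big)$$

where $X$ is a standard Gaussian random variable.

In the case of Example 1, one can easily check that $C^{\mathbf{f}}_{2n} = \{3\}$ and $C^{\mathbf{f}}_{2n+1} = \{1\}$. Thus $C^{\mathbf{f}}_{\infty,\mathrm{out}} = \{1,3\}$ while $C^{\mathbf{f}}_{\infty,\mathrm{in}} = \emptyset$. It is straightforward to check that the rate functions driving the LDP of $\pi_{2n}$ and $\pi_{2n+1}$ can be expressed as:

$$\Lambda^*_{\mathrm{even}}(z) = \sup_{\lambda\in\mathcal{D}_3}\lambda z \qquad\text{and}\qquad \Lambda^*_{\mathrm{odd}}(z) = \sup_{\lambda\in\mathcal{D}_1}\lambda z.$$

The very reason for which the LDP does not hold in this case is that

$$\bigcap_{y\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_y \ \neq \bigcap_{y\in C^{\mathbf{f}}_{\infty,\mathrm{in}}}\mathcal{D}_y.$$

In the case of Example 2, $C^{\mathbf{f}}_{2n} = \{3,4\}$ while $C^{\mathbf{f}}_{2n+1} = \{1,4\}$. Therefore $C^{\mathbf{f}}_{\infty,\mathrm{out}} = \{1,3,4\}$ while $C^{\mathbf{f}}_{\infty,\mathrm{in}} = \{4\}$. Despite the fact that $C^{\mathbf{f}}_{\infty,\mathrm{out}} \neq C^{\mathbf{f}}_{\infty,\mathrm{in}}$, the LDP holds in this case with good rate function given by:

$$\Lambda^*(z) = \sup_{\lambda\in\mathcal{D}_4}\lambda z.$$

As we shall see, the underlying reason for which the LDP holds is

$$\bigcap_{y\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_y \ = \bigcap_{y\in C^{\mathbf{f}}_{\infty,\mathrm{in}}}\mathcal{D}_y \quad (=\mathcal{D}_4),$$

and this will be a key point in the statement of Assumption (A-4).

We are now in position to state the assumptions and the main result.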
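The comparison above can be checked with a few lines of code. This is a sketch limited to the scalar Gaussian setting of the two examples, where $\mathcal{D}_y = (-\infty, (2y)^{-1})$ for $y > 0$; the function names are ours. The intersection of such half-lines is again a half-line, and its support function is linear on $[0,\infty)$ and infinite on $(-\infty, 0)$.

```python
import math

def domain_upper_bound(ys):
    """Right end-point of the intersection of D_y = (-inf, 1/(2y)) over y in ys.

    All y are assumed positive. An empty family gives the whole real line,
    encoded here by an infinite end-point.
    """
    if not ys:
        return math.inf
    return 1.0 / (2.0 * max(ys))

def support_function(z, ys):
    """sup of lambda*z over lambda in the intersection of the D_y, y in ys."""
    if z < 0:
        return math.inf      # lambda can go to -infinity
    if z == 0:
        return 0.0
    return domain_upper_bound(ys) * z   # infinite when the family is empty
```

In Example 1 the outer limit `{1, 3}` gives `z / 6` while the empty inner limit gives an infinite value for `z > 0`, so the two bounds disagree; in Example 2 both `{1, 3, 4}` and `{4}` give `z / 8`.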

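2.3. Assumptions and main results. Let $C_n$ be a finite subset of $\mathcal{X}$ and recall that

$$C_n^{\mathbf{f}} = \{\mathbf{f}(x_i^n),\ x_i^n\in C_n\} \quad\text{where}\quad \frac{\mathrm{card}(C_n)}{n}\xrightarrow[n\to\infty]{} 0.$$

Let $\mathbf{y}$ be an $m\times d$ matrix and denote by

$$\mathcal{D}_{\mathbf{y}} = \big\{\lambda\in\mathbb{R}^m,\ \log\mathbb{E}\,e^{\langle\lambda,\,\mathbf{y}\cdot Z_1\rangle} < \infty\big\}. \tag{2.2}$$

We can now state our assumptions.

Assume that $Z_1$ is an $\mathbb{R}^d$-valued random variable satisfying Assumption (A-1) and recall that $I$ is the rate function associated to $Z_1/n$.

Assumption A-2. Let $\mathcal{D}_Z = \{\theta\in\mathbb{R}^d,\ \log\mathbb{E}\,e^{\langle\theta, Z_1\rangle} < \infty\}$, then

$$I(z) = \Delta^*(z\mid\mathcal{D}_Z).$$

In particular, $I$ is a convex rate function.

Assumption A-3. Let $(D_n)_{n\ge 1}$ be a sequence of non-empty subsets of $\mathbb{R}^{m\times d}$. There exists a compact set $K\subset\mathbb{R}^{m\times d}$ such that $D_n\subset K$ for every $n\ge 1$.

Remark 2.2. This assumption implies in particular that the outer limit $D_{\infty,\mathrm{out}}$ of $(D_n)_{n\ge 1}$ is a non-empty compact set of $\mathbb{R}^{m\times d}$.

Assumption A-4. Let $(D_n)_{n\ge 1}$ be a sequence of subsets of $\mathbb{R}^{m\times d}$. Denote by $D_{\infty,\mathrm{in}}$ and $D_{\infty,\mathrm{out}}$ its inner and outer limits. Then:

$$\bigcap_{\mathbf{y}\in D_{\infty,\mathrm{in}}}\mathcal{D}_{\mathbf{y}} = \bigcap_{\mathbf{y}\in D_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}},$$

where $\mathcal{D}_{\mathbf{y}}$ is defined by (2.2).

Remark 2.3. If $(D_n)_{n\ge 1}$ fulfills (A-3) and (A-4), then in particular, $D_{\infty,\mathrm{in}}$ is not empty.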

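We can now state the main result of the section.

Theorem 2.2. Assume that $(Z_i)_{i\in\mathbb{N}}$ is a sequence of $\mathbb{R}^d$-valued i.i.d. random variables. Assume moreover that (A-1) and (A-2) hold for $Z_1$. Assume that $(\mathcal{X},\rho)$ is a metric space and let $C_n\subset\mathcal{X}$ be such that

$$\frac{\mathrm{card}(C_n)}{n}\xrightarrow[n\to\infty]{} 0.$$

Denote by $C_n^{\mathbf{f}} = \{\mathbf{f}(x_i^n),\ x_i^n\in C_n\}$ where $\mathbf{f}:\mathcal{X}\to\mathbb{R}^{m\times d}$ is continuous. Assume that (A-3) and (A-4) hold for the sequence of sets $(C_n^{\mathbf{f}})_{n\ge 1}$. Then the random variable

$$\pi_n = \frac{1}{n}\sum_{x_i^n\in C_n}\mathbf{f}(x_i^n)\cdot Z_i$$

satisfies the LDP in $(\mathbb{R}^m,\mathcal{B}(\mathbb{R}^m))$ with good rate function

$$\Delta^*(z\mid\mathcal{D}) = \sup\{\langle\lambda,z\rangle,\ \lambda\in\mathcal{D}\} \quad\text{where}\quad \mathcal{D} = \bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{in}}}\mathcal{D}_{\mathbf{y}} = \bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}.$$

Remark 2.4 (On Assumption (A-4)). A close look at the proof of Theorem 2.2 shows that the rate function that drives the lower bound of the LDP is the support function of $\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{in}}}\mathcal{D}_{\mathbf{y}}$ while the rate function that drives the upper bound is the support function of $\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}$. Both rate functions coincide when assuming (A-4) (see also the examples in Section 2.2).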



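2.4. Proof of Theorem 2.2. In order to prove Theorem 2.2, we follow the strategy developed in [11], essentially based on an exponential approximation technique. The next lemma is the counterpart of Lemma 5.1 in [11].

Lemma 2.3. Let $\phi:\mathbb{N}\setminus\{0\}\to\mathbb{N}\setminus\{0\}$ be such that $\frac{\phi(n)}{n}\xrightarrow[n\to\infty]{} 0$. Let $(Z_i)$ be a sequence of $\mathbb{R}^d$-valued random variables satisfying (A-1) and (A-2). Then $\bar{Z}_n = \frac{1}{n}\sum_{i=1}^{\phi(n)} Z_i$ satisfies the LDP in $\mathbb{R}^d$ with good rate function given by

$$I(y) = \Delta^*(y\mid\mathcal{D}_Z)$$

where $\mathcal{D}_Z$ is defined in (A-2).

Proof. Denote by $\Lambda_n$ the log-Laplace transform of $\bar{Z}_n$, i.e. $\Lambda_n(\theta) = \log\mathbb{E}\,e^{\langle\theta,\bar{Z}_n\rangle}$. Then

$$\frac{1}{n}\Lambda_n(n\theta) = \frac{\phi(n)}{n}\log\mathbb{E}\,e^{\langle\theta,Z_1\rangle}\xrightarrow[n\to\infty]{}\Delta(\theta\mid\mathcal{D}_Z).$$

Therefore, the large deviation upper bound holds for $\bar{Z}_n$ with rate function $I$ by Theorem 2.3.6 (a) in [7]. To prove the large deviation lower bound, it is sufficient to prove that

$$-I(y) \le \liminf_{n\to\infty}\frac{1}{n}\log\mathbb{P}\big(\bar{Z}_n\in B(y,\varepsilon)\big)$$

where $B(y,\varepsilon) = \{y'\in\mathbb{R}^d,\ |y'-y| < \varepsilon\}$. Define

$$\tilde{Z}_n = \begin{cases}\frac{1}{n}\sum_{i=2}^{\phi(n)} Z_i & \text{if } \phi(n)\ge 2,\\ 0 & \text{otherwise.}\end{cases}$$

Then $\{Z_1/n\in B(y,\varepsilon/3)\}\cap\{\tilde{Z}_n\in B(0,\varepsilon/3)\}\subset\{\bar{Z}_n\in B(y,\varepsilon)\}$ which yields

$$\frac{1}{n}\log\mathbb{P}\big(Z_1/n\in B(y,\varepsilon/3)\big) + \frac{1}{n}\log\mathbb{P}\big(\tilde{Z}_n\in B(0,\varepsilon/3)\big) \le \frac{1}{n}\log\mathbb{P}\big(\bar{Z}_n\in B(y,\varepsilon)\big). \tag{2.3}$$

The exponential Markov inequality yields $\lim_{n\to\infty}\mathbb{P}\{|\tilde{Z}_n| > \varepsilon/3\} = 0$, which readily implies that $\lim_{n\to\infty}\mathbb{P}\{\tilde{Z}_n\in B(0,\varepsilon/3)\} = 1$. Consequently, taking the liminf on both sides of (2.3) and using the lower bound for the single variable $Z_1/n$ yields the desired lower bound. The proof is completed. □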

We first consider Theorem 2.2 under an additional assumption.

Lemma 2.4. Under the same assumptions as in Theorem 2.2 and if we assume in addition that

$$C_n^{\mathbf{f}}\xrightarrow[n\to\infty]{pk} C_\infty^{\mathbf{f}}, \tag{2.4}$$

then $\pi_n$ satisfies the LDP in $\mathbb{R}^m$ with good rate function $\Delta^*(\cdot\mid\mathcal{D})$, where $\mathcal{D} = \bigcap_{\mathbf{y}\in C_\infty^{\mathbf{f}}}\mathcal{D}_{\mathbf{y}}$.

The proof of Lemma 2.4 is postponed to Appendix A.

We now relax the extra assumption (2.4) and prove Theorem 2.2. The scheme of the proof is the following. We first show, using directly the result in Lemma 2.4, that the lower bound is driven by the support function of the set $\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{in}}}\mathcal{D}_{\mathbf{y}}$. We then obtain that the upper bound is driven by the support function of the set $\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}$, by majorizing the log-Laplace transform of $\pi_n$. Under Assumption (A-4), both bounds coincide and we get the full LDP.

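Proof of Theorem 2.2. To get the lower bound, we split $C_n^{\mathbf{f}}$ into two disjoint subsets:

$$C_n^{\mathbf{f}} = I_n^{\mathbf{f}}\cup O_n^{\mathbf{f}} \quad\text{where}\quad I_n^{\mathbf{f}}\xrightarrow[n\to\infty]{pk} C^{\mathbf{f}}_{\infty,\mathrm{in}}. \tag{2.5}$$

Let us sketch the construction of $I_n^{\mathbf{f}}$. Let $B(\mathbf{z},\frac{1}{m})$ be a ball centered at $\mathbf{z}\in C^{\mathbf{f}}_{\infty,\mathrm{in}}$ with radius $\frac{1}{m}$. Since $C^{\mathbf{f}}_{\infty,\mathrm{in}}$ is compact by (A-3), there exist $(\mathbf{z}_\ell)_{1\le\ell\le L_m}$ such that

$$C^{\mathbf{f}}_{\infty,\mathrm{in}}\subset\bigcup_{\ell=1}^{L_m} B\Big(\mathbf{z}_\ell,\frac{1}{m}\Big) \quad\text{and}\quad B\Big(\mathbf{z}_\ell,\frac{1}{m}\Big)\cap C^{\mathbf{f}}_{\infty,\mathrm{in}}\neq\emptyset \ \text{ for } 1\le\ell\le L_m.$$

The mere definition of $C^{\mathbf{f}}_{\infty,\mathrm{in}}$ yields that there exists $\psi(m)$ such that for all $\ell$, $1\le\ell\le L_m$,

$$\forall n\ge\psi(m),\quad \exists\,\mathbf{f}(x_i^n)\in B\Big(\mathbf{z}_\ell,\frac{1}{m}\Big) \ \text{ with } \mathbf{f}(x_i^n)\in C_n^{\mathbf{f}}.$$

Denote by $A_{n,m}$ ($n\ge\psi(m)$) such a collection of $\mathbf{f}(x_i^n)$'s. Choose now similarly a collection of balls with radius $\frac{1}{m+1}$ and the related $\psi(m+1)$ with $\psi(m+1) > \psi(m)$, and set

$$I_n^{\mathbf{f}} = A_{n,m} \quad\text{if } \psi(m)\le n < \psi(m+1).$$

Then $I_n^{\mathbf{f}}\xrightarrow[n\to\infty]{pk} C^{\mathbf{f}}_{\infty,\mathrm{in}}$. We write

$$\pi_n = \frac{1}{n}\sum_{\mathbf{f}(x_i^n)\in I_n^{\mathbf{f}}}\mathbf{f}(x_i^n)\cdot Z_i + \frac{1}{n}\sum_{\mathbf{f}(x_i^n)\in O_n^{\mathbf{f}}}\mathbf{f}(x_i^n)\cdot Z_i = \pi_n^I + \pi_n^O.$$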
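The lower bound can be established as in Lemma 2.3. Let us prove that:

$$-\Delta^*\Big(z\,\Big|\,\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{in}}}\mathcal{D}_{\mathbf{y}}\Big) \le \liminf_{n\to\infty}\frac{1}{n}\log\mathbb{P}\big(\pi_n\in B(z,\varepsilon)\big). \tag{2.6}$$

Since

$$\{\pi_n^I\in B(z,\varepsilon/3)\}\cap\{\pi_n^O\in B(0,\varepsilon/3)\}\subset\{\pi_n\in B(z,\varepsilon)\},$$

one has

$$\frac{1}{n}\log\mathbb{P}\big(\pi_n^I\in B(z,\varepsilon/3)\big) + \frac{1}{n}\log\mathbb{P}\big(\pi_n^O\in B(0,\varepsilon/3)\big) \le \frac{1}{n}\log\mathbb{P}\big(\pi_n\in B(z,\varepsilon)\big). \tag{2.7}$$

The exponential Markov inequality yields $\lim_{n\to\infty}\mathbb{P}(|\pi_n^O| > \varepsilon/3) = 0$. This in turn implies that $\lim_{n\to\infty}\mathbb{P}(\pi_n^O\in B(0,\varepsilon/3)) = 1$. Since $\pi_n^I$ fulfills the assumptions of Lemma 2.4, the following lower bound holds:

$$-\Delta^*\Big(z\,\Big|\,\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{in}}}\mathcal{D}_{\mathbf{y}}\Big) \le \liminf_{n\to\infty}\frac{1}{n}\log\mathbb{P}\big(\pi_n^I\in B(z,\varepsilon/3)\big). \tag{2.8}$$

Consequently, taking the liminf on both sides of (2.7) and using (2.8) yields the desired lower bound. The proof of the lower bound is completed.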

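Let us now prove the upper bound. Denote by $\Lambda_n(\lambda)$ the log-Laplace transform of $\pi_n$, i.e. $\Lambda_n(\lambda) = \log\mathbb{E}\,e^{\langle\lambda,\pi_n\rangle}$. In order to prove the upper bound, we estimate the following limit:

$$\frac{1}{n}\Lambda_n(n\lambda) = \frac{1}{n}\sum_{x_i^n\in C_n}\log\mathbb{E}\,e^{\langle\lambda,\,\mathbf{f}(x_i^n)\cdot Z_i\rangle} \quad\text{where}\quad \frac{\mathrm{card}(C_n)}{n}\xrightarrow[n\to\infty]{} 0.$$

We shall prove that

$$\limsup_{n\to\infty}\frac{1}{n}\Lambda_n(n\lambda) \le \Delta\Big(\lambda\,\Big|\,\mathrm{int}\Big(\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}\Big)\Big). \tag{2.9}$$

Theorem 4.5.3 in [7] will then yield:

$$\limsup_{n\to\infty}\frac{1}{n}\log\mathbb{P}(\pi_n\in F) \le -\inf_{z\in F}\Delta^*\Big(z\,\Big|\,\mathrm{int}\Big(\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}\Big)\Big) \overset{(a)}{=} -\inf_{z\in F}\Delta^*\Big(z\,\Big|\,\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}\Big) \tag{2.10}$$

for any closed set $F$. Equality (a) follows from Proposition 2.1 and the fact that $\mathrm{int}\big(\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}\big)$ is a non-empty convex set due to (A-1).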

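In order to prove (2.9), consider $\lambda\in\mathbb{R}^m$ such that

$$\limsup_{n\to\infty}\frac{1}{n}\Lambda_n(n\lambda) > 0. \tag{2.11}$$

From (2.11), we can successively:

- extract a subsequence $n_\alpha$ from $n$ such that

$$\lim_{n\to\infty}\frac{1}{n_\alpha}\sum_{x_i^{n_\alpha}\in C_{n_\alpha}}\log\mathbb{E}\,e^{\langle\lambda,\,\mathbf{f}(x_i^{n_\alpha})\cdot Z_i\rangle} > 0;$$

- extract a subsequence $n_\beta$ from $n_\alpha$ and elements $x_*^{n_\beta}\in C_{n_\beta}$ such that

$$\lim_{n\to\infty}\mathbb{E}\,e^{\langle\lambda,\,\mathbf{f}(x_*^{n_\beta})\cdot Z\rangle} = \infty;$$

- extract a subsequence $n_\gamma$ from $n_\beta$ such that

$$\mathbf{f}(x_*^{n_\gamma})\xrightarrow[n\to\infty]{}\mathbf{y}_0.$$

One can notice in particular that $\mathbf{y}_0\in C^{\mathbf{f}}_{\infty,\mathrm{out}}$. Let us now prove that

$$\lambda\notin\mathrm{int}(\mathcal{D}_{\mathbf{y}_0}). \tag{2.12}$$

Assume that (2.12) is not true. Then there exists $p > 1$ such that $p\lambda\in\mathcal{D}_{\mathbf{y}_0}$. Let $\varepsilon > 0$ be arbitrarily small. Then, if $n$ is large enough to ensure that $|\lambda|\,|\mathbf{f}(x_*^{n_\gamma}) - \mathbf{y}_0| < \varepsilon/q$ where $1/p + 1/q = 1$, Hölder's inequality yields

$$\mathbb{E}\,e^{\langle\lambda,\,\mathbf{f}(x_*^{n_\gamma})\cdot Z\rangle} = \mathbb{E}\,e^{\langle\lambda,\,\mathbf{y}_0\cdot Z\rangle}\,e^{\langle\lambda,\,(\mathbf{f}(x_*^{n_\gamma})-\mathbf{y}_0)\cdot Z\rangle} \le \Big(\mathbb{E}\,e^{p\langle\lambda,\,\mathbf{y}_0\cdot Z\rangle}\Big)^{1/p}\Big(\mathbb{E}\,e^{\varepsilon|Z|}\Big)^{1/q},$$

and the right-hand side is finite and does not depend on $n$ as soon as $\varepsilon\le\alpha$, with $\alpha$ given by (A-1). This contradicts the fact that

$$\lim_{n\to\infty}\mathbb{E}\,e^{\langle\lambda,\,\mathbf{f}(x_*^{n_\gamma})\cdot Z\rangle} = \infty.$$

Therefore (2.12) holds and yields that $\lambda\notin\mathrm{int}\big(\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}\big)$. From this, we deduce that

$$\limsup_{n\to\infty}\frac{1}{n}\Lambda_n(n\lambda) > 0 \ \Longrightarrow\ \lambda\notin\mathrm{int}\Big(\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}\Big).$$

Otherwise stated:

$$\limsup_{n\to\infty}\frac{1}{n}\Lambda_n(n\lambda) \le \Delta\Big(\lambda\,\Big|\,\mathrm{int}\Big(\bigcap_{\mathbf{y}\in C^{\mathbf{f}}_{\infty,\mathrm{out}}}\mathcal{D}_{\mathbf{y}}\Big)\Big).$$

Therefore, (2.9) is proved and so is (2.10).

Gathering the lower bound (2.6), the upper bound (2.10) and Assumption (A-4) yields the full LDP for $\pi_n$. □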



3. The LDP for the empirical mean and the rate function in the convex case

Our goal is now to get the full LDP for $L_n$ (Theorem 3.2 below). As announced in the outline of the article, the first step is to split the $x_i^n$'s into two different subsets according to whether they live near the support of the limiting measure or whether they are outliers.
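3.1. The decomposition $L_n = \pi_n + \tilde{L}_n$. Recall that $(\mathcal{X},\rho)$ is a metric space.

Proposition 3.1. Let $A_n = \{x_i^n,\ 1\le i\le n\}$. Assume that

$$R_n = \frac{1}{n}\sum_{i=1}^n\delta_{x_i^n}\xrightarrow[n\to\infty]{\text{weakly}} R$$

and denote by $\mathcal{Y}$ the support of $R$. Then there exist subsets $B_n$ and $C_n = A_n\setminus B_n$ such that

(1) $\dfrac{\mathrm{card}(B_n)}{n}\xrightarrow[n\to\infty]{} 1$,

(2) $\dfrac{1}{n}\sum_{x_i^n\in B_n}\delta_{x_i^n}\xrightarrow[n\to\infty]{\text{weakly}} R$,

(3) $\rho(B_n,\mathcal{Y})\xrightarrow[n\to\infty]{} 0$ where $\mathcal{Y}$ is the support of $R$.

We will then set

$$\tilde{L}_n = \frac{1}{n}\sum_{x_i^n\in B_n}\mathbf{f}(x_i^n)\cdot Z_i \quad\text{and}\quad \pi_n = \frac{1}{n}\sum_{x_i^n\in C_n}\mathbf{f}(x_i^n)\cdot Z_i.$$

Note that since $\mathrm{card}(B_n) + \mathrm{card}(C_n) = n$, property (1) then yields that $\frac{\mathrm{card}(C_n)}{n}\to 0$ as $n$ goes to infinity.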

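Proof. Construction of $B_n$. Let $m\ge 1$ be fixed and denote by $\mathcal{Y}_m$ the $\frac{1}{m}$-blowup of $\mathcal{Y}$, i.e. $\mathcal{Y}_m = \{x\in\mathcal{X},\ \rho(x,\mathcal{Y}) < \frac{1}{m}\}$ where $\mathcal{Y}$ is the support of $R$. Then $\frac{1}{n}\sum_{i=1}^n 1_{\mathcal{Y}_m}(x_i^n)\to 1$; in particular there exists $\psi_m\ge 1$ such that for all $n\ge\psi_m$:

$$\Big|\frac{1}{n}\sum_{i=1}^n 1_{\mathcal{Y}_m}(x_i^n) - 1\Big| \le \frac{1}{m}.$$

One can then build recursively a sequence of integers $(\psi_m)_{m\in\mathbb{N}}$ such that $\psi_m < \psi_{m+1}$ (so that $\psi_m\to\infty$ as $m\to\infty$). Set

$$B_n = \{x_i^n\in\mathcal{Y}_m,\ 1\le i\le n\} \quad\text{for } \psi_m\le n < \psi_{m+1}.$$

We prove property (1) and leave the proofs of properties (2) and (3) to the reader.

Let $\varepsilon > 0$ be fixed and take $m$ such that $\frac{1}{m} < \varepsilon$. For such an $m$, take the corresponding $\psi_m$ and let $n\ge\psi_m$. Then $n$ belongs to $[\psi_{m'},\psi_{m'+1})$ for some $m'\ge m$, so that

$$\Big|\frac{\mathrm{card}(B_n)}{n} - 1\Big| = \Big|\frac{1}{n}\sum_{i=1}^n 1_{\mathcal{Y}_{m'}}(x_i^n) - 1\Big| \le \frac{1}{m'} \le \frac{1}{m} < \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, property (1) is proved. □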
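The splitting step of this construction can be sketched as follows. This is only an illustration under simplifying assumptions of ours: the space is the real line, the support is an interval `[lo, hi]`, and the blowup index `m` is passed by hand instead of being driven by the sequence $(\psi_m)$ of the proof. The function name is hypothetical.

```python
def split_bulk_outliers(points, lo, hi, m):
    """Split real points between the open 1/m-blowup of [lo, hi] and the rest.

    Returns (bulk, outliers): `bulk` plays the role of B_n and `outliers`
    the role of C_n for one fixed n and one fixed blowup index m.
    """
    radius = 1.0 / m
    bulk, outliers = [], []
    for x in points:
        # Distance from x to the interval [lo, hi].
        dist = max(lo - x, 0.0, x - hi)
        if dist < radius:
            bulk.append(x)
        else:
            outliers.append(x)
    return bulk, outliers
```

A point at distance 0.3 from the interval is kept in the bulk for `m = 2` but is classified as an outlier for `m = 4`, which mirrors the fact that the blowups shrink as `m` grows.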



3.2. The LDP for the empirical mean $L_n$. In order to get the full LDP for $L_n = \tilde{L}_n + \pi_n$, we need to prove the LDP for $\tilde{L}_n$. We will mainly rely on the results in [11]. The following assumption is needed:

Assumption A-5. Assume that $(\mathcal{X},\rho)$ is a locally compact metric space. The family $(x_i^n,\ 1\le i\le n,\ n\ge 1)\subset\mathcal{X}$ satisfies

$$R_n = \frac{1}{n}\sum_{i=1}^n\delta_{x_i^n}\xrightarrow[n\to\infty]{\text{weakly}} R,$$

where $R$ is a probability measure over $(\mathcal{X},\mathcal{B}(\mathcal{X}))$. Moreover, the support of $R$ denoted by $\mathcal{Y}$ is a compact set and for every non-empty open set $U$ of $\mathcal{Y}$ (for the induced topology over $\mathcal{Y}$), $R(U) > 0$.

Remark 3.1. The LDP may fail to hold if the last part of Assumption (A-5), that is $R(U) > 0$ for $U$ non-empty open set, is not fulfilled. Counterexamples, also closely related to Assumption (A-1), are developed in [11].

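We recall that we denote by $\Lambda(\theta) = \log\mathbb{E}\,e^{\langle\theta,Z_1\rangle}$ the log-Laplace transform of $Z_1$. We introduce the following functional

$$\Gamma(\lambda) = \int_{\mathcal{X}}\Lambda\Big(\sum_{k=1}^m\lambda_k\,\mathbf{f}_k(x)\Big)R(dx), \tag{3.1}$$

where $\lambda = (\lambda_1,\dots,\lambda_m)\in\mathbb{R}^m$ and $\mathbf{f}_k$ denotes the $k$-th row of matrix $\mathbf{f}$. Let $\Gamma^*$ be the convex conjugate of $\Gamma$:

$$\Gamma^*(z) = \sup_{\lambda\in\mathbb{R}^m}\{\langle\lambda,z\rangle - \Gamma(\lambda)\}.$$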

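We can now state the LDP.

Theorem 3.2. Let $(Z_i)_{i\in\mathbb{N}}$ be a sequence of $\mathbb{R}^d$-valued i.i.d. random variables where $Z_1$ satisfies (A-1) and (A-2).

Consider a triangular array $(x_i^n,\ 1\le i\le n,\ n\ge 1)\subset\mathcal{X}$ which fulfills (A-5).

Denote by $C_n^{\mathbf{f}} = \{\mathbf{f}(x_i^n),\ x_i^n\in C_n\}$ where $C_n$ is a subset of $\{x_i^n,\ 1\le i\le n\}$ given by Proposition 3.1 and $\mathbf{f}:\mathcal{X}\to\mathbb{R}^{m\times d}$ is continuous. Assume that $C_n^{\mathbf{f}}$ satisfies (A-3) and (A-4). Then

$$L_n = \frac{1}{n}\sum_{i=1}^n\mathbf{f}(x_i^n)\cdot Z_i$$

satisfies the LDP in $(\mathbb{R}^m,\mathcal{B}(\mathbb{R}^m))$ with good rate function

$$I_{\mathbf{f}}(z) = \inf\{\Gamma^*(z_1) + \Delta^*(z_2\mid\mathcal{D}),\ z_1 + z_2 = z\},$$

where the definition of $\mathcal{D}$ follows from Theorem 2.2.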
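Proof. Recall the decomposition $L_n = \tilde{L}_n + \pi_n$ where

$$\tilde{L}_n = \frac{1}{n}\sum_{x_i^n\in B_n}\mathbf{f}(x_i^n)\cdot Z_i \quad\text{and}\quad \pi_n = \frac{1}{n}\sum_{x_i^n\in C_n}\mathbf{f}(x_i^n)\cdot Z_i,$$

where the sets $B_n$ and $C_n$ are defined in Section 3.1. Theorem 2.2 yields the LDP for $\pi_n$ with good rate function $\Delta^*(\cdot\mid\mathcal{D})$. It remains now to prove the LDP for $\tilde{L}_n$. We will rely on Theorem 2.2 in [11] and therefore slightly modify $\tilde{L}_n$ so that it fulfills the assumptions of this theorem.

In fact, it is required in [11] that all the points $x_i^n$ belong to $\mathcal{Y}$, which might not be the case here. We build in the sequel a sequence $(r(x_i^n))\subset\mathcal{Y}$ which approximates the sequence $(x_i^n,\ x_i^n\in B_n)$. Let $x_i^n\in B_n$ and set

$$r(x_i^n) = \begin{cases} x_i^n & \text{if } x_i^n\in\mathcal{Y},\\ \text{one of the } \mathrm{argmin}\{\rho(x,x_i^n),\ x\in\mathcal{Y}\} & \text{else.}\end{cases}$$

Such a minimizer always exists and belongs to $\mathcal{Y}$ since $\mathcal{Y}$ is compact.

Since $\lim_{n\to\infty}\sup\{\rho(x,\mathcal{Y}),\ x\in B_n\} = 0$, one has $\sup_{x_i^n\in B_n}\rho(x_i^n,r(x_i^n))\xrightarrow[n\to\infty]{} 0$ and

$$\kappa_n(\mathbf{f}) \triangleq \sup_{x_i^n\in B_n}\big\{|\mathbf{f}(x_i^n) - \mathbf{f}(r(x_i^n))|\big\}\xrightarrow[n\to\infty]{} 0.$$

Indeed, for $n$ large enough, $B_n$ lies in an $\varepsilon$-blowup of $\mathcal{Y}$, which is compact since $\mathcal{X}$ is locally compact, and $\mathbf{f}$ is therefore uniformly continuous on this set.

Now, if we define $\check{L}_n$ by

$$\check{L}_n = \frac{1}{n}\sum_{x_i^n\in B_n}\mathbf{f}(r(x_i^n))\cdot Z_i,$$

then $\tilde{L}_n$ and $\check{L}_n$ are exponentially equivalent. Indeed,

$$\frac{1}{n}\log\mathbb{P}\big(|\tilde{L}_n - \check{L}_n| > \varepsilon\big) \le \frac{1}{n}\log\mathbb{P}\Big(\frac{1}{n}\sum_{i=1}^n|Z_i| > \frac{\varepsilon}{\kappa_n(\mathbf{f})}\Big) \le -\Lambda^*_{|Z|}\Big(\frac{\varepsilon}{\kappa_n(\mathbf{f})}\Big)\xrightarrow[n\to\infty]{} -\infty,$$

where $\Lambda^*_{|Z|}$ stands for the convex conjugate of the log-Laplace transform of $|Z|$. The empirical mean $\check{L}_n$ satisfies all the assumptions of Theorem 2.2 in [11]. Therefore, the LDP holds for it with good rate function $\Gamma^*$. Finally the exponential equivalence yields the LDP for $\tilde{L}_n$ with the same rate function (see for instance [7, Theorem 4.2.13]).

As the two subsums are independent, the contraction principle yields the LDP for $L_n$ with good rate function $I_{\mathbf{f}}$ given by:

$$I_{\mathbf{f}}(z) = \inf\{\Gamma^*(z_1) + \Delta^*(z_2\mid\mathcal{D}),\ z_1 + z_2 = z\}. \tag{3.2}$$

□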

3.3. More insight on the rate function $I_{\mathbf{f}}$. In the convex case, that is when Assumption (A-2) holds, the rate function $I_{\mathbf{f}}$ can be expressed more explicitly. This section is aimed at describing how to perform the inf-convolution (3.2).

We first introduce some definitions from convex analysis (see e.g. [13]). The main result is stated in Theorem 3.6.

Definition 3.3 (Normal cone). Let $C\subset\mathbb{R}^d$ be a convex set and let $a\in C$. The normal cone of $C$ at $a$, denoted by $N_C(a)$, is defined by:

$$N_C(a) = \{z\in\mathbb{R}^d;\ \langle z, x-a\rangle\le 0,\ \forall x\in C\}.$$

Remark 3.2. In particular, if $z\in N_C(a)$ then $\Delta^*(z\mid C) = \langle z,a\rangle$.

Definition 3.4 (Relative interior). Let $C\subset\mathbb{R}^d$ be a convex set. Its affine hull, denoted by $\mathrm{aff}\,C$, is the smallest affine subset of $\mathbb{R}^d$ containing $C$. The relative interior of $C$, denoted by $\mathrm{ri}\,C$, is defined by:

$$\mathrm{ri}\,C = \{x\in\mathrm{aff}\,C,\ \exists\,\varepsilon > 0 \text{ such that } (x + \varepsilon B(0,1))\cap\mathrm{aff}\,C\subset C\}.$$
Definition 3.5 (Subdifferential of a convex function). A vector $x^*$ is said to be a subgradient of a convex function $f$ at a point $x$ if for any $z$,

$$f(z)\ge f(x) + \langle x^*, z-x\rangle.$$

The subdifferential $\partial f(x)$ of $f$ at $x$ is the set of all subgradients of $f$ at $x$.

We can now state:

Theorem 3.6. Under the assumptions of Theorem 3.2, the rate function $I_{\mathbf{f}}$ admits the following representation:

$$I_{\mathbf{f}}(z) = \sup_{\lambda\in\mathcal{D}}\big(\langle\lambda,z\rangle - \Gamma(\lambda)\big), \tag{3.3}$$

where $\Gamma$ is given by (3.1). Furthermore, for any $z\in\mathrm{ri}\,\mathrm{dom}\,I_{\mathbf{f}}$, we can decompose $z$ as $z = z^* + z_n$, where there exists $\lambda^*\in\mathrm{dom}\,\Gamma\cap\bar{\mathcal{D}}$ such that:

(i) $z^*\in\partial\Gamma(\lambda^*)$ and

(ii) $z_n\in N_{\bar{\mathcal{D}}}(\lambda^*)$.

In particular, for any such decomposition,

$$I_{\mathbf{f}}(z) = \Gamma^*(z^*) + \Delta^*(z_n\mid\mathcal{D}).$$

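Remark 3.3 (Non-exposed points). Let $z\in\mathrm{ri}\,\mathrm{dom}\,I_{\mathbf{f}}$. Consider the decomposition given by Theorem 3.6, namely $z = z^* + z_n$, then:

$$\forall t\in\mathbb{R}^+,\quad I_{\mathbf{f}}(z^* + t z_n) = \Gamma^*(z^*) + t\langle z_n,\lambda^*\rangle \quad\text{where } z^*\in\partial\Gamma(\lambda^*) \text{ and } z_n\in N_{\bar{\mathcal{D}}}(\lambda^*).$$

In particular if $z_n\neq 0$, $I_{\mathbf{f}}$ is affine in the direction $\mathbb{R}^+\ni t\mapsto z^* + t z_n$ and has thus infinitely many non-exposed points (see for instance the example developed in Section 4).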
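The decomposition $z = z^* + z_n$ can be illustrated on a toy case. The sketch below rests on assumptions of ours, not on the paper's examples: dimension one, $Z_i = X_i^2$ with $X_i$ standard Gaussian, $R = \delta_1$ with constant weight 1 in the bulk, and a single outlier weight `c > 1`, so that $\Gamma(\lambda) = -\frac12\log(1-2\lambda)$ and $\mathcal{D} = (-\infty, 1/(2c))$. The inf-convolution (3.2) is evaluated by brute force on a grid and compared with $\Gamma^*(z^*) + \langle\lambda^*, z_n\rangle$, where $\lambda^* = 1/(2c)$ and $z^* = \Gamma'(\lambda^*) = c/(c-1)$. Function names are hypothetical.

```python
import math

def gamma_star(z1):
    """Convex conjugate of Gamma(lam) = -0.5*log(1 - 2*lam)."""
    if z1 <= 0:
        return math.inf
    return 0.5 * (z1 - 1.0 - math.log(z1))

def inf_convolution(z, c, steps=100000):
    """Grid approximation of inf over z1 + z2 = z of
    gamma_star(z1) + support_D(z2), with support_D(z2) = z2/(2c) for z2 >= 0
    (and +infinity for z2 < 0, so only 0 < z1 <= z is explored)."""
    best = math.inf
    for k in range(1, steps + 1):
        z1 = z * k / steps
        best = min(best, gamma_star(z1) + (z - z1) / (2.0 * c))
    return best

def decomposed_value(z, c):
    """Gamma*(z_star) + lam_star * z_n, following the decomposition z = z_star + z_n."""
    lam_star = 1.0 / (2.0 * c)
    z_star = c / (c - 1.0)          # derivative of Gamma at lam_star
    z_n = z - z_star
    if z_n < 0:
        # Constraint inactive: the normal component vanishes and z_star = z.
        return gamma_star(z)
    return gamma_star(z_star) + lam_star * z_n
```

For `c = 3.0` and `z = 3.0`, the decomposition is `z_star = 1.5`, `z_n = 1.5`, and moving further along `z_n` increases the rate function linearly with slope `1/6`.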

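Proof. We first prove (3.3). Theorem 3.2 and Proposition 2.1 yield

$$I_{\mathbf{f}}(z) = \inf_{z = z_1 + z_2}\{\Gamma^*(z_1) + \Delta^*(z_2\mid\bar{\mathcal{D}})\}.$$

As $I_{\mathbf{f}}$, $\Gamma$ and $\Delta(\cdot\mid\bar{\mathcal{D}})$ are convex, proper and lower semicontinuous, we get from Theorem 16.4 in [13] that

$$I_{\mathbf{f}}(z) = [\Gamma + \Delta(\cdot\mid\bar{\mathcal{D}})]^*(z) = \sup_{\lambda\in\mathbb{R}^m}\{\langle\lambda,z\rangle - \Gamma(\lambda) - \Delta(\lambda\mid\bar{\mathcal{D}})\} = \sup_{\lambda\in\bar{\mathcal{D}}}\{\langle\lambda,z\rangle - \Gamma(\lambda)\} = \sup_{\lambda\in\mathcal{D}}\{\langle\lambda,z\rangle - \Gamma(\lambda)\},$$

and (3.3) is proved. As $I_{\mathbf{f}}$ is convex, so is its domain and we can consider its relative interior $\mathrm{ri}\,\mathrm{dom}\,I_{\mathbf{f}}$. Let $z\in\mathrm{ri}\,\mathrm{dom}\,I_{\mathbf{f}}$, then $I_{\mathbf{f}}(z) < +\infty$ and define $F_z$ by:

$$F_z(x) = \Gamma^*(x) + \Delta^*(z - x\mid\mathcal{D}).$$

The properties of $\Gamma^*$ and $\Delta^*(\cdot\mid\mathcal{D})$ yield that $F_z$ is proper, convex and lower semicontinuous; its level sets are compact. In particular, the infimum of $F_z$ is attained over $\mathbb{R}^m$. Let $z^*$ be a point where this infimum is attained, i.e.

$$\inf_{x\in\mathbb{R}^m}F_z(x) = F_z(z^*).$$

In this case,

$$0\in\partial F_z(z^*).$$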

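In order to go further in the proof, we shall describe $\partial F_z(z^*)$ in terms of $\partial\Gamma^*$ and $\partial\Delta^*(z - \cdot\mid\mathcal{D})$. This is the purpose of the following proposition:

Proposition 3.7. If $z\in\mathrm{ri}\,\mathrm{dom}\,I_{\mathbf{f}}$, then for any $x$,

$$\partial F_z(x) = \partial\Gamma^*(x) - \partial\Delta^*(z - x\mid\mathcal{D}).$$

Proof of Proposition 3.7. Define $f_z$ to be the function given by $f_z(x) = \Delta^*(z - x\mid\mathcal{D})$. Note in particular that $F_z(x) = \Gamma^*(x) + f_z(x)$. Since $I_{\mathbf{f}}(z) = \inf_{z = z_1 + z_2}\{\Gamma^*(z_1) + \Delta^*(z_2\mid\mathcal{D})\}$, the sum of the epigraphs of $\Gamma^*$ and $\Delta^*$ is equal to the epigraph of $I_{\mathbf{f}}$. This immediately implies that

$$\mathrm{dom}\,I_{\mathbf{f}} = \mathrm{dom}\,\Gamma^* + \mathrm{dom}\,\Delta^*(\cdot\mid\mathcal{D}).$$

These sets being convex, Corollary 6.6.2 in [13] yields

$$\mathrm{ri}\,\mathrm{dom}\,I_{\mathbf{f}} = \mathrm{ri}\,\mathrm{dom}\,\Gamma^* + \mathrm{ri}\,\mathrm{dom}\,\Delta^*(\cdot\mid\mathcal{D}).$$

Let $z\in\mathrm{ri}\,\mathrm{dom}\,I_{\mathbf{f}}$, then there exists $y\in\mathrm{ri}\,\mathrm{dom}\,\Gamma^*$ such that $z - y\in\mathrm{ri}\,\mathrm{dom}\,\Delta^*(\cdot\mid\mathcal{D})$. This is equivalent to the fact that $y\in\mathrm{ri}\,\mathrm{dom}\,f_z$ and therefore

$$\mathrm{ri}\,\mathrm{dom}\,\Gamma^*\cap\mathrm{ri}\,\mathrm{dom}\,f_z\neq\emptyset. \tag{3.4}$$

Theorem 23.8 in [13], whose main assumption is fulfilled by (3.4), then yields

$$\partial F_z(x) = \partial\Gamma^*(x) + \partial f_z(x) = \partial\Gamma^*(x) - \partial\Delta^*(z - x\mid\mathcal{D})$$

and Proposition 3.7 is proved. □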
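Let us now go back to the proof of Theorem 3.6. By Proposition 3.7,

$$\partial F_z(z^*) = \partial\Gamma^*(z^*) - \partial\Delta^*(z - z^*\mid\mathcal{D}).$$

Since $0\in\partial F_z(z^*)$, there exists $\lambda^*\in\partial\Gamma^*(z^*)$ such that $\lambda^*\in\partial\Delta^*(z - z^*\mid\mathcal{D})$. By applying Theorem 23.5 in [13], one obtains

$$\lambda^*\in\partial\Gamma^*(z^*) \iff z^*\in\partial\Gamma(\lambda^*)$$

which in particular implies that $\lambda^*\in\mathrm{dom}\,\Gamma$. Moreover,

$$\lambda^*\in\partial\Delta^*(z - z^*\mid\mathcal{D}) \iff z - z^*\in\partial\Delta(\lambda^*\mid\bar{\mathcal{D}}) \iff z - z^*\in N_{\bar{\mathcal{D}}}(\lambda^*),$$

which in particular implies that $\lambda^*\in\bar{\mathcal{D}}$.

Denote by $z_n = z - z^*$, then one obtains the decomposition stated in Theorem 3.6. It remains to prove that:

$$I_{\mathbf{f}}(z) = \Gamma^*(z^*) + \Delta^*(z_n\mid\mathcal{D}).$$

We have:

$$I_{\mathbf{f}}(z) = \sup_{\lambda\in\bar{\mathcal{D}}}\{\langle\lambda,z\rangle - \Gamma(\lambda)\} \ge \langle\lambda^*,z^*\rangle - \Gamma(\lambda^*) + \langle\lambda^*,z_n\rangle = \Gamma^*(z^*) + \Delta^*(z_n\mid\mathcal{D}).$$

On the other hand,

$$I_{\mathbf{f}}(z) = \sup_{\lambda\in\bar{\mathcal{D}}}\{\langle\lambda,z\rangle - \Gamma(\lambda)\} \le \sup_{\lambda\in\bar{\mathcal{D}}}\{\langle\lambda,z^*\rangle - \Gamma(\lambda)\} + \sup_{\lambda\in\bar{\mathcal{D}}}\langle\lambda,z_n\rangle = \Gamma^*(z^*) + \langle\lambda^*,z_n\rangle,$$

and Theorem 3.6 is proved. □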

4. AN EXAMPLE OF LDP IN THE CONVEX CASE 

To illustrate the range of Theorems 3.2 and 3.6, we study in detail the following model:

$$L_n = \frac{1}{n} \sum_{i=1}^n f(x_i^n) \cdot Z_i \quad \text{where} \quad f(x) = \begin{pmatrix} 1 & 0 \\ 0 & x \end{pmatrix} \quad \text{and} \quad Z_i = \begin{pmatrix} X_i^2 \\ X_i^2 \end{pmatrix}, \tag{4.1}$$

the sequence $(X_i)_{i \in \mathbb{N}}$ being a sequence of i.i.d. $\mathcal{N}(0,1)$ Gaussian random variables and $(x_i^n)$ being a sequence of real numbers satisfying

$$R_n = \frac{1}{n} \sum_{i=1}^n \delta_{x_i^n} \xrightarrow[n \to \infty]{} R.$$

We assume moreover that the support $\mathcal{Y}$ of $R$ is given by $\mathcal{Y} = [m, M]$ and that

$$\sup_{1 \le i \le n} x_i^n \xrightarrow[n \to \infty]{} x_{\max} > M \quad \text{and} \quad \inf_{1 \le i \le n} x_i^n \xrightarrow[n \to \infty]{} x_{\min} < m.$$

Our goal is to establish the LDP for $L_n$ and to describe as explicitly as possible the related rate function $I_f$.

Remark 4.1. This example can be seen as an extension to dimension 2 of the example studied in [5]. Indeed, under the same assumptions, Bercu et al. study the LDP for the empirical mean $\frac{1}{n} \sum_{i=1}^n x_i^n X_i^2$.

Proposition 4.1 below is devoted to the description of the rate function. We first need the following notations. For $(\xi, \xi') \in \mathbb{R}^2$, set

$$\Gamma(\xi, \xi') = -\frac{1}{2} \int \log(1 - 2\xi - 2x\xi') \, R(dx), \tag{4.2}$$

and denote by $\Gamma^*$ the convex conjugate of $\Gamma$ (the expression for $\Gamma$ follows from a Gaussian integration and from formula (3.1)). Define $H$ to be the Hilbert transform of $R$, that is

$$H(t) = \int \frac{R(dx)}{t - x} \quad \text{for } t \in [m, M]^c.$$
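The Gaussian integration behind (4.2) can be checked numerically. The following is a minimal sketch, assuming a discrete measure $R$ given as a list of (atom, weight) pairs; the function names and the example atoms are illustrative and do not come from the paper. It compares the closed form $\log \mathbb{E}\, e^{\theta X^2} = -\frac{1}{2}\log(1 - 2\theta)$, valid for $\theta < 1/2$, with a trapezoidal quadrature, and evaluates $\Gamma$ for such a discrete $R$.

```python
import math


def log_mgf_closed(theta):
    """Closed form of log E[exp(theta * X^2)] for X ~ N(0, 1); infinite if theta >= 1/2."""
    if theta >= 0.5:
        return math.inf
    return -0.5 * math.log(1.0 - 2.0 * theta)


def log_mgf_quadrature(theta, half_width=12.0, steps=20_000):
    """Trapezoidal approximation of log of the integral of exp(theta x^2) phi(x) dx.

    Only meaningful for theta well below 1/2, where the integrand is
    negligible outside [-half_width, half_width].
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp((theta - 0.5) * x * x)
    return math.log(total * h / math.sqrt(2.0 * math.pi))


def gamma_discrete(xi, xi_prime, atoms):
    """Gamma(xi, xi') = -1/2 * sum_k w_k log(1 - 2 xi - 2 x_k xi') for R = sum_k w_k delta_{x_k}."""
    total = 0.0
    for x, w in atoms:
        arg = 1.0 - 2.0 * xi - 2.0 * x * xi_prime
        if arg <= 0.0:
            return math.inf  # (xi, xi') lies outside dom Gamma
        total += -0.5 * w * math.log(arg)
    return total
```

For instance, `gamma_discrete(0.1, 0.05, [(-1.0, 0.5), (1.0, 0.5)])` evaluates $\Gamma$ for the hypothetical two-point measure $R = (\delta_{-1} + \delta_1)/2$.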

Set

$$H_{\min} = H(x_{\min}) \ \text{ and } \ \alpha_{\min} = x_{\min} - \frac{1}{H_{\min}}, \qquad H_{\max} = H(x_{\max}) \ \text{ and } \ \alpha_{\max} = x_{\max} - \frac{1}{H_{\max}}.$$

Note that under the assumption that $x_{\min} < m$ and $x_{\max} > M$, $H_{\min}$ is a well-defined negative number while $H_{\max}$ is a well-defined positive number. In particular $x_{\min} < \alpha_{\min}$ and $\alpha_{\max} < x_{\max}$. Moreover, the following inequalities hold true:

$$m < \alpha_{\min} \le \int x \, R(dx) \quad \text{and} \quad \int x \, R(dx) < \alpha_{\max} < M.$$

In particular, $\alpha_{\min} < \alpha_{\max}$. In order to describe the rate function related to the LDP of $L_n$, we introduce the following domains:

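$$\mathcal{D}_\infty = \{(x,y) \in \mathbb{R}^2, \ x \le 0 \ \text{or} \ y \ge x_{\max} x \ \text{or} \ y \le x_{\min} x\},$$

$$\mathcal{D}_{(I_f = \Gamma^*)} = \{(x,y) \in \mathbb{R}^2, \ x > 0 \ \text{and} \ \alpha_{\min} x \le y \le \alpha_{\max} x\},$$

$$\mathcal{D}^+_{\mathrm{linear}} = \{(x,y) \in \mathbb{R}^2, \ x > 0 \ \text{and} \ \alpha_{\max} x < y < x_{\max} x\},$$

$$\mathcal{D}^-_{\mathrm{linear}} = \{(x,y) \in \mathbb{R}^2, \ x > 0 \ \text{and} \ x_{\min} x < y < \alpha_{\min} x\}.$$

These domains are represented in Figure 3 (right). We can now state the following result.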

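Proposition 4.1. The empirical mean $L_n$ defined in (4.1) satisfies the LDP in $\mathbb{R}^2$ with good rate function $I_f$ given by

(1) If $(x,y) \in \mathcal{D}_\infty$ then $I_f(x,y) = +\infty$,

(2) If $(x,y) \in \mathcal{D}_{(I_f = \Gamma^*)}$ then $I_f(x,y) = \Gamma^*(x,y)$,

(3) If $(x,y) \in \mathcal{D}^+_{\mathrm{linear}}$ then

$$I_f(x,y) = \Gamma^*\big(H_{\max}(x_{\max} x - y), \ \alpha_{\max} H_{\max}(x_{\max} x - y)\big) + \frac{1}{2}\big((1 - H_{\max} x_{\max}) x + H_{\max} y\big),$$

(4) If $(x,y) \in \mathcal{D}^-_{\mathrm{linear}}$ then

$$I_f(x,y) = \Gamma^*\big(H_{\min}(x_{\min} x - y), \ \alpha_{\min} H_{\min}(x_{\min} x - y)\big) + \frac{1}{2}\big((1 - H_{\min} x_{\min}) x + H_{\min} y\big).$$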



Remark 4.2. Let $x_0 > 0$ be fixed and consider the ray:

$$y^-(x) = x_{\min} x + (\alpha_{\min} - x_{\min}) x_0, \quad x \ge x_0.$$

Then

$$I_f(x, y^-(x)) = \Gamma^*(x_0, \alpha_{\min} x_0) + \frac{1}{2}(x - x_0).$$

In particular, there are infinitely many non-exposed points for $I_f$ along the ray $((x, y^-(x)); \ x > x_0)$. The same can be shown along the ray

$$y^+(x) = x_{\max} x + (\alpha_{\max} - x_{\max}) x_0, \quad x \ge x_0.$$
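The constants entering Proposition 4.1 are easy to compute for a discrete measure. The sketch below assumes the hypothetical choice $R = (\delta_{-1} + \delta_1)/2$ with $x_{\min} = -4$ and $x_{\max} = 4$, and reads $\alpha = x_{\text{out}} - 1/H(x_{\text{out}})$ for each outlier limit; this particular $R$ is an assumption (it happens to give $H_{\max} = -H_{\min} = 4/15$), and the helper names are illustrative. The function `zone` returns the index of the item of Proposition 4.1 that applies to a point $(x,y)$, with the boundary rays assigned as in a partition of the plane.

```python
from fractions import Fraction

# Hypothetical example data: R = (delta_{-1} + delta_{1}) / 2, so [m, M] = [-1, 1],
# with outlier limits x_min = -4 and x_max = 4.
ATOMS = [(Fraction(-1), Fraction(1, 2)), (Fraction(1), Fraction(1, 2))]
X_MIN, X_MAX = Fraction(-4), Fraction(4)


def hilbert(t, atoms=ATOMS):
    """H(t) = integral of R(dx) / (t - x), for t outside the support of R."""
    return sum(w / (t - x) for x, w in atoms)


def alpha(x_out, atoms=ATOMS):
    """alpha = x_out - 1 / H(x_out) for an outlier limit x_out."""
    return x_out - 1 / hilbert(x_out, atoms)


def zone(x, y):
    """Index (1 to 4) of the item of Proposition 4.1 that applies to (x, y)."""
    a_min, a_max = alpha(X_MIN), alpha(X_MAX)
    if x <= 0 or y >= X_MAX * x or y <= X_MIN * x:
        return 1  # infinite rate function
    if a_min * x <= y <= a_max * x:
        return 2  # rate function equals Gamma^*
    return 3 if y > a_max * x else 4  # upper / lower linear zone
```

With these values, `alpha(X_MIN)` and `alpha(X_MAX)` equal $-1/4$ and $1/4$, which indeed lie strictly between $m$, the mean of $R$ and $M$.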

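Proof of Proposition 4.1. The LDP will be established as soon as the assumptions of Theorem 3.2 are fulfilled. It is straightforward to check (A-1) to (A-3) and (A-5). In order to check Assumption (A-4), we rely on the following lemma:

Lemma 4.2. For every $x \in [x_{\min}, x_{\max}]$ one has:

$$\mathcal{D}_{f(x_{\min})} \cap \mathcal{D}_{f(x_{\max})} \subset \mathcal{D}_{f(x)}.$$

Proof of Lemma 4.2. Let $(\xi, \xi') \in \mathcal{D}_{f(x_{\min})} \cap \mathcal{D}_{f(x_{\max})}$. This implies that $(\xi, x_{\min} \xi') \in \mathcal{D}_{Z_1}$ and $(\xi, x_{\max} \xi') \in \mathcal{D}_{Z_1}$. Every $x \in [x_{\min}, x_{\max}]$ can be written as a convex combination of $x_{\min}$ and $x_{\max}$: $x = a x_{\min} + b x_{\max}$, where $a + b = 1$, $a, b$ being nonnegative. By convexity of $\mathcal{D}_{Z_1}$, $(\xi, x \xi') = a(\xi, x_{\min} \xi') + b(\xi, x_{\max} \xi') \in \mathcal{D}_{Z_1}$. Therefore $(\xi, \xi') \in \mathcal{D}_{f(x)}$. □

We can now check (A-4). The mere definition of $x_{\min}$ and $x_{\max}$ implies that both $x_{\min}$ and $x_{\max}$ belong to $\mathcal{C}_{\infty,\mathrm{out}}$ and $\mathcal{C}_{\infty,\mathrm{in}}$ and that both $\mathcal{C}_{\infty,\mathrm{out}}$ and $\mathcal{C}_{\infty,\mathrm{in}}$ are included in $[x_{\min}, x_{\max}]$. In particular, the set $\mathcal{D}$ is well defined and is given by:

$$\mathcal{D} = \bigcap_{\{x, \ f(x) \in \mathcal{C}_{\infty,\mathrm{out}}\}} \mathcal{D}_{f(x)} \overset{(a)}{=} \mathcal{D}_{f(x_{\min})} \cap \mathcal{D}_{f(x_{\max})} \overset{(b)}{=} \bigcap_{\{x, \ f(x) \in \mathcal{C}_{\infty,\mathrm{in}}\}} \mathcal{D}_{f(x)},$$

where (a) and (b) follow from Lemma 4.2. An easy computation yields

$$\mathcal{D} = \{(\xi, \xi') \in \mathbb{R}^2; \ 1 - 2\xi - 2x_{\min}\xi' > 0 \ \text{and} \ 1 - 2\xi - 2x_{\max}\xi' > 0\}. \tag{4.3}$$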

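The LDP is therefore established by applying Theorem 3.2 and the rate function is given by:

$$I_f(z) = \inf_{z = z_1 + z_2} \{\Gamma^*(z_1) + \Delta^*(z_2 \mid \mathcal{D})\},$$

with $\mathcal{D}$ as above and $\Gamma$ as defined in (3.1). Formula (4.2) yields:

$$\operatorname{dom} \Gamma = \{(\xi, \xi') \in \mathbb{R}^2; \ 1 - 2\xi - 2x\xi' > 0 \ \text{for all} \ x \in [m, M]\},$$

and therefore

$$\operatorname{dom} \Gamma = \{(\xi, \xi') \in \mathbb{R}^2; \ 1 - 2\xi - 2m\xi' > 0 \ \text{and} \ 1 - 2\xi - 2M\xi' > 0\}. \tag{4.4}$$

Figure 2 shows $\operatorname{dom} \Gamma$ and $\mathcal{D}$ for particular choices of the parameters.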

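We first prove Proposition 4.1-(1). In order to prove this statement, it is equivalent to determine the domain of $I_f$. We use the fact that

$$\operatorname{dom} I_f = \operatorname{dom} \Gamma^* + \operatorname{dom} \Delta^*(\cdot \mid \mathcal{D})$$

and focus on the two domains of the right-hand side. One can check that

$$\operatorname{dom} \Gamma^* = \{(x,y) \in \mathbb{R}^2; \ x > 0 \ \text{and} \ mx < y < Mx\},$$

$$\operatorname{dom} \Delta^*(\cdot \mid \mathcal{D}) = \{(x,y) \in \mathbb{R}^2; \ x > 0 \ \text{and} \ x_{\min} x < y < x_{\max} x\}.$$

Therefore

$$\operatorname{dom} I_f = \{(x,y) \in \mathbb{R}^2; \ x > 0 \ \text{and} \ x_{\min} x < y < x_{\max} x\}. \tag{4.5}$$

Note in particular that in this case, $\operatorname{ri\,dom} I_f = \operatorname{dom} I_f$.

The three domains $\operatorname{dom} \Gamma^*$, $\operatorname{dom} \Delta^*(\cdot \mid \mathcal{D})$ and $\operatorname{dom} I_f$ are represented on Figure 3.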




FIGURE 2. $\operatorname{dom} \Gamma$ for $m = -1$ and $M = 1$ (left) and $\mathcal{D}$ for $x_{\min} = -4$ and $x_{\max} = 4$ (right). On the picture of $\mathcal{D}$, some of the normal cones to $\mathcal{D}$ are also shown; their directions are represented by the arrows.




FIGURE 3. The left picture represents $\operatorname{dom} \Gamma^*$ (hatched cone) and $\operatorname{dom} \Delta^*(\cdot \mid \mathcal{D})$ (delimited by the two half-lines $y = 4x$ and $y = -4x$). The right picture represents the four zones of $\mathbb{R}^2$ where $I_f$ has a particular expression. Zone (1) (resp. (2), (3) and (4)) represents $\mathcal{D}_\infty$ (resp. $\mathcal{D}_{(I_f = \Gamma^*)}$, $\mathcal{D}^+_{\mathrm{linear}}$ and $\mathcal{D}^-_{\mathrm{linear}}$). We kept the same values of the parameters as in Figure 2 and chose a particular $R$ for which $H_{\max} = -H_{\min} = 4/15$.



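We now prove Proposition 4.1-(2). Theorem 3.6 yields:

$$I_f(z) = \sup_{\lambda \in \mathcal{D}} \{\langle \lambda, z \rangle - \Gamma(\lambda)\}.$$

If one considers $g_z(\lambda) = \langle \lambda, z \rangle - \Gamma(\lambda)$, one can check that for $z = (x,y) \in \operatorname{dom} \Gamma^*$, an element $\lambda = (\xi, \xi')$ realizing the supremum of $g_z$ satisfies the condition

$$q - \frac{1}{H(q)} = \frac{y}{x}, \quad \text{with} \quad q = \frac{1 - 2\xi}{2\xi'}.$$

Therefore $\lambda \in \operatorname{dom} \Gamma \cap \mathcal{D}$ if and only if $\frac{y}{x} \in [\alpha_{\min}, \alpha_{\max}]$ and in this case $I_f(z) = \Gamma^*(z)$.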

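We now turn to the proof of Proposition 4.1-(3). From Theorem 3.6 we just need to exhibit a decomposition $z = z^* + z_n$, where $z^* \in \partial \Gamma(\lambda^*)$ and $z_n \in N_{\mathcal{D}}(\lambda^*)$ for some $\lambda^* \in \operatorname{dom} \Gamma \cap \mathcal{D}$. In this case, the value of $I_f(z)$ is given by $I_f(z) = \Gamma^*(z^*) + \langle \lambda^*, z_n \rangle$. One can check that $\operatorname{dom} \Gamma \cap \mathcal{D}$ can be split into three subsets: the interior of $\mathcal{D}$, and the two half-lines $\{1 - 2\xi - 2x_{\min}\xi' = 0, \ \xi < 1/2\}$ and $\{1 - 2\xi - 2x_{\max}\xi' = 0, \ \xi < 1/2\}$. The normal cones to $\mathcal{D}$ are then easy to determine:

- if $(\xi, \xi') \in \operatorname{int} \mathcal{D}$, then $N_{\mathcal{D}}(\xi, \xi') = \{(0,0)\}$,

- if $\xi < 1/2$ and $1 - 2\xi - 2x_{\min}\xi' = 0$, then $N_{\mathcal{D}}(\xi, \xi') = \{t(1, x_{\min}), \ t \ge 0\}$,

- if $\xi < 1/2$ and $1 - 2\xi - 2x_{\max}\xi' = 0$, then $N_{\mathcal{D}}(\xi, \xi') = \{t(1, x_{\max}), \ t \ge 0\}$.

These normal cones are represented by the arrows on Figure 2 (right).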

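We can now conclude the proof of the third point of the proposition. If we choose

$$z^* = \big(H_{\min}(x_{\min} x - y), \ \alpha_{\min} H_{\min}(x_{\min} x - y)\big) \quad \text{and} \quad z_n = z - z^*,$$

it is easy to check that this decomposition fulfills the required properties, i.e. $z^* \in \partial \Gamma(\lambda^*)$ and $z_n \in N_{\mathcal{D}}(\lambda^*)$ for some $\lambda^* \in \operatorname{dom} \Gamma \cap \mathcal{D}$. Therefore,

$$I_f(z) = \Gamma^*(z^*) + \langle \lambda^*, z_n \rangle = \Gamma^*(z^*) + \frac{1}{2}\big(x + H_{\min}(y - x_{\min} x)\big).$$

The decomposition $z = z^* + z_n$ can be seen on Figure 4.

The proof of Proposition 4.1-(4) is very similar and is left to the reader. □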
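The decomposition $z = z^* + z_n$ can be tested numerically against the variational formula $I_f(z) = \sup_{\lambda \in \mathcal{D}}\{\langle \lambda, z \rangle - \Gamma(\lambda)\}$. The sketch below is only an illustration under stated assumptions: it takes the hypothetical measure $R = (\delta_{-1} + \delta_1)/2$ with $x_{\min} = -4$, $x_{\max} = 4$, for which a direct computation gives $\Gamma^*(x,y) = \frac{x}{2} - \frac{1}{2} - \frac{1}{4}\log(x^2 - y^2)$ when $x > |y|$. It treats a point of the zone $x_{\min} x < y < \alpha_{\min} x$ and approximates the supremum by a plain grid search over the closure of $\mathcal{D}$. All function names are illustrative.

```python
import math

X_MIN, X_MAX = -4.0, 4.0
ATOMS = [(-1.0, 0.5), (1.0, 0.5)]  # assumed R = (delta_{-1} + delta_{1}) / 2


def gamma(xi, xip):
    """Gamma(xi, xi') = -1/2 * integral of log(1 - 2 xi - 2 x xi') R(dx)."""
    total = 0.0
    for a, w in ATOMS:
        arg = 1.0 - 2.0 * xi - 2.0 * a * xip
        if arg <= 0.0:
            return math.inf
        total -= 0.5 * w * math.log(arg)
    return total


def gamma_star(x, y):
    """Closed-form conjugate of gamma for this specific two-point R."""
    if x <= abs(y):
        return math.inf
    return 0.5 * x - 0.5 - 0.25 * math.log(x * x - y * y)


def rate_via_decomposition(x, y):
    """Rate at a point with x_min x < y < alpha_min x, via z = z* + t (1, x_min)."""
    h_min = sum(w / (X_MIN - a) for a, w in ATOMS)
    alpha_min = X_MIN - 1.0 / h_min
    x_star = h_min * (X_MIN * x - y)
    z_star = (x_star, alpha_min * x_star)
    t = x - x_star  # z_n = t * (1, x_min), and <lambda*, z_n> = t / 2
    return gamma_star(*z_star) + 0.5 * t, z_star, t


def rate_via_grid_search(x, y, step=0.01):
    """Crude approximation of sup over the closure of D of <lambda, z> - Gamma(lambda)."""
    best = -math.inf
    for i in range(-300, 50):
        xi = i * step
        for j in range(-100, 101):
            xip = j * step
            in_d = (1.0 - 2.0 * xi - 2.0 * X_MIN * xip >= -1e-12
                    and 1.0 - 2.0 * xi - 2.0 * X_MAX * xip >= -1e-12)
            if not in_d:
                continue
            g = gamma(xi, xip)
            if g < math.inf:
                best = max(best, xi * x + xip * y - g)
    return best
```

At $z = (1, -1.5)$ the decomposition gives $z^* = (2/3, -1/6)$ and $t = 1/3$, and both functions return approximately $-\frac{1}{4}\log(5/12)$; the maximizer $\lambda^* = (-0.3, -0.2)$ sits on the boundary line $1 - 2\xi - 2x_{\min}\xi' = 0$.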




FIGURE 4. For a $z = (x, y)$ such that $x_{\min} x < y < \alpha_{\min} x$, we decompose $z = z^* + z_n$ with $z^*$ such that $y^* = \alpha_{\min} x^*$ and $z_n = t(1, x_{\min})$, for a $t > 0$.

Remarks on the LDP and the spherical integral. We conclude this section with remarks related to the prime motivation of this study, namely the study of the asymptotics of spherical integrals. We recall from [9] that the goal is to get the asymptotics of

$$I_n(A_n, B_n) = \int e^{n \operatorname{Tr}(A_n U B_n U^*)} \, dm_n(U), \tag{4.6}$$

where $A_n$ and $B_n$ are two real diagonal matrices and $m_n$ is the Haar measure on the orthogonal group. Obtaining the asymptotic expansion of such integrals has major applications, for instance in statistics. Indeed, the asymptotic expansion for the joint eigenvalue density of some deformed Wigner matrices can readily be deduced from the above integral.

In the case where $A_n$ is of rank one, with a unique nonzero eigenvalue denoted by $\theta$, and where $B_n = \operatorname{diag}(x_i^n, \ 1 \le i \le n)$ with $\frac{1}{n} \sum_{i=1}^n \delta_{x_i^n}$ converging, the spherical integral can be written as

$$I_n(A_n, B_n) = \mathbb{E} \exp\left(n\theta \, \frac{\sum_{i=1}^n x_i^n X_i^2}{\sum_{i=1}^n X_i^2}\right), \tag{4.7}$$

where $\mathbb{E}$ is the expectation under the standard $n$-dimensional Gaussian measure.

A natural strategy to tackle the asymptotics of $I_n$ is then to establish the LDP for the empirical mean $L_n$ as studied in the previous example and to apply Varadhan's lemma to get the asymptotics of $I_n$ (see [9, Theorem 6]).
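The algebra behind the rank-one reduction can be illustrated on a small example. The sketch below, with illustrative names and arbitrary numerical values, checks that for $A = \operatorname{diag}(\theta, 0, \dots, 0)$, $B = \operatorname{diag}(x_1, \dots, x_n)$ and an orthogonal matrix $U$, one has $\operatorname{Tr}(A U B U^T) = \theta \sum_i x_i U_{1i}^2$, so that only the first row of $U$ matters. The orthogonal matrix used here is a Householder reflection, chosen for convenience; it is not a Haar sample.

```python
def householder(v):
    """Orthogonal reflection I - 2 v v^T / (v^T v) built from a nonzero vector v."""
    n = len(v)
    norm2 = sum(c * c for c in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm2
             for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def trace_form(theta, xs, u):
    """Tr(A U B U^T) with A = diag(theta, 0, ..., 0) and B = diag(xs)."""
    a = diag([theta] + [0.0] * (len(xs) - 1))
    b = diag(xs)
    return trace(matmul(matmul(a, u), matmul(b, transpose(u))))


def rank_one_form(theta, xs, u):
    """theta * sum_i x_i * U[0][i]^2, which involves only the first row of U."""
    return theta * sum(x * u[0][i] ** 2 for i, x in enumerate(xs))
```

Since the first row of a Haar-distributed orthogonal matrix is uniform on the unit sphere, and hence has the law of a normalized standard Gaussian vector, this identity is what allows the integral over the orthogonal group to be rewritten as a Gaussian expectation.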

Besides the fact that we fully recover the LDP result of [9], we believe that the representation of the rate function (Theorem 3.6) sheds new light on the role played by the largest and smallest eigenvalues in the asymptotics of the rank-one spherical integral. The very reason comes from the fact that the individual rate function of the particle $\frac{1}{n} f(x_i^n) \cdot Z_i$ fulfills the convexity assumption (A-2). This is in particular illustrated in Lemma 4.2.

In the forthcoming section, we study the LDP in the non-convex case, that is when (A-2) is not fulfilled. This will lead to partial results in the study of the asymptotics of the spherical integral beyond the rank-one case.

5. The LDP in the non-convex case 

There are several models which fulfill Assumption (A-1) with a non-convex rate function. Take for instance the simple model $Z_1 = (X_1^2, Y_1^2, X_1 Y_1)$ where $X_1$ and $Y_1$ are independent standard Gaussian random variables. Denote by $C = \{(x,y,z) \in \mathbb{R}^3, \ z = -\sqrt{xy} \ \text{or} \ z = \sqrt{xy}\}$; then $\frac{Z_1}{n}$ satisfies the LDP with good rate function

$$I(x,y,z) = \frac{x}{2} + \frac{y}{2} + \Delta(z \mid C) \quad \text{where} \quad \Delta(z \mid C) = \begin{cases} 0 & \text{if } z \in C, \\ \infty & \text{else,} \end{cases}$$

which is highly non-convex. We will see that this kind of model arises in the study of spherical integrals and may give rise to interesting phenomena.

We give in this section an assumption over the set $\mathcal{A}_n = \{x_i^n \in \mathcal{X}, \ 1 \le i \le n\}$ which ensures the LDP for $L_n$ to hold. Although quite stringent, this assumption encompasses interesting models as we shall see. We then state the LDP.

Recall that $\mathcal{Y}$ is the support of the limiting probability $R$.

Assumption A-6. Assume that $\mathcal{X} \subset \mathbb{R}^p$ for a given integer $p$. Denote by $\mathcal{A}_n = \{x_i^n \in \mathcal{X}, \ 1 \le i \le n\}$. Then there exists an integer $T$ such that:

$$\mathcal{A}_n = \tilde{\mathcal{A}}_n \cup \bigcup_{\ell=1}^T \{x_{i_\ell}^n\}$$

where $\rho(\tilde{\mathcal{A}}_n, \mathcal{Y})$ goes to zero as $n \to \infty$ while for $1 \le \ell \le T$,

$$x_{i_\ell}^n \xrightarrow[n \to \infty]{} x_\ell^\infty,$$

where the $x_\ell^\infty$'s do not belong to $\mathcal{Y}$.

Remark 5.1. Assumption (A-6) implies that there exists a finite number of outliers $x_{i_\ell}^n$ that remain outside the support $\mathcal{Y}$ and that converge pointwise to a limit $x_\ell^\infty$.

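Theorem 5.1. Assume that $(Z_i)_{i \in \mathbb{N}}$ is a sequence of $\mathbb{R}^d$-valued i.i.d. random variables where $Z_1$ satisfies (A-1). Assume that (A-3) and (A-6) hold for the sequence $(x_i^n, \ 1 \le i \le n, \ n \ge 1)$. Then

$$L_n = \frac{1}{n} \sum_{i=1}^n f(x_i^n) \cdot Z_i$$

satisfies the LDP in $(\mathbb{R}^m, \mathcal{B}(\mathbb{R}^m))$ with good rate function

$$I_f(z) = \inf \left\{ \Gamma^*(z_0) + \sum_{\ell=1}^T I(y_\ell); \quad z_0 + \sum_{\ell=1}^T f(x_\ell^\infty) \cdot y_\ell = z \right\}. \tag{5.1}$$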

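Proof. Recall that $\mathcal{A}_n = \tilde{\mathcal{A}}_n \cup \bigcup_{\ell=1}^T \{x_{i_\ell}^n\}$ by (A-6) and write:

$$L_n = \frac{1}{n} \sum_{x_i^n \in \tilde{\mathcal{A}}_n} f(x_i^n) \cdot Z_i + \frac{1}{n} \sum_{\ell=1}^T f(x_{i_\ell}^n) \cdot Z_{i_\ell}.$$

One can prove the LDP for $\frac{1}{n} \sum_{x_i^n \in \tilde{\mathcal{A}}_n} f(x_i^n) \cdot Z_i$ as in the proof of Theorem 3.2 (which relies on an adaptation of Theorem 2.1 in [11] and does not involve the convexity of $I$). On the other hand, $\sum_{\ell=1}^T f(x_{i_\ell}^n) \cdot \frac{Z_{i_\ell}}{n}$ is exponentially equivalent to $\sum_{\ell=1}^T f(x_\ell^\infty) \cdot \frac{Z_{i_\ell}}{n}$, which satisfies the LDP with good rate function

$$J(z) = \inf \left\{ \sum_{\ell=1}^T I(y_\ell), \quad \sum_{\ell=1}^T f(x_\ell^\infty) \cdot y_\ell = z \right\}.$$

Since $\frac{1}{n} \sum_{x_i^n \in \tilde{\mathcal{A}}_n} f(x_i^n) \cdot Z_i$ and $\frac{1}{n} \sum_{\ell=1}^T f(x_{i_\ell}^n) \cdot Z_{i_\ell}$ are independent, the LDP holds with good rate function $I_f$ given by (5.1). The proof of Theorem 5.1 is completed. □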

6. AN EXAMPLE OF LDP IN THE NON-CONVEX CASE: INFLUENCE OF THE SECOND LARGEST EIGENVALUE

6.1. Presentation of the example. In this section, we shall study a simple model which underlines the differences between the LDP in the convex case and the LDP in the non-convex one. Consider the set $\mathcal{A}_n = \{x_i^n, \ 1 \le i \le n\}$ where $x_1^n = \kappa_1$, $x_2^n = \kappa_2$ and $x_i^n = 1$ for $i \ge 3$. Assume the following:

$$1 < \kappa_2 < \kappa_1.$$

One can think of the $x_i^n$ as the eigenvalues of an $n \times n$ matrix and one can check that

$$\frac{1}{n} \sum_{i=1}^n \delta_{x_i^n} \xrightarrow[n \to \infty]{} \delta_1,$$

while $\kappa_1$ and $\kappa_2$ are two outliers.

In the sequel, we study the influence of the second largest eigenvalue $\kappa_2$ over the rate function of a given LDP in a convex and non-convex case. We prove that the second largest eigenvalue has no influence on the rate function that drives the LDP in the convex case (Proposition 6.1) while this eigenvalue has an impact on the LDP in the non-convex case (Proposition 6.2). We finally go back to spherical integrals and make some concluding remarks.

Denote by $f$ the following matrix-valued function:

$$f(x) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ x & 0 & 0 \\ 0 & x & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Let us now introduce the random variables we will consider.

6.2. The convex model. Consider a family of $\mathbb{R}^3$-valued random variables $(Z_i)_{i \ge 1}$ satisfying Assumptions (A-1) and (A-2). Denote by

$$L_n(Z) = \frac{1}{n} \sum_{i=1}^n f(x_i^n) \cdot Z_i = \frac{1}{n} f(\kappa_1) \cdot Z_1 + \frac{1}{n} f(\kappa_2) \cdot Z_2 + \frac{1}{n} \sum_{i=3}^n f(x_i^n) \cdot Z_i = \pi_n^1(Z) + \pi_n^2(Z) + \hat{L}_n(Z)$$

and by $\tilde{L}_n(Z) = \pi_n^1(Z) + \hat{L}_n(Z)$.

One can apply Theorem 3.2 to $L_n(Z)$ and $\tilde{L}_n(Z)$, which therefore satisfy LDPs with given rate functions that we denote respectively by $I_Z$ and $\tilde{I}_Z$.

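Proposition 6.1. The rate functions $I_Z$ and $\tilde{I}_Z$ related to the LDPs of $L_n(Z)$ and $\tilde{L}_n(Z)$ are equal.

Remark 6.1. This proposition underlines the fact that the second largest eigenvalue does not have any influence on the rate function of the LDP.

Proof. Let

$$Z_i = \begin{pmatrix} U_i \\ V_i \\ W_i \end{pmatrix}, \quad \text{then} \quad f(x) \cdot Z_i = \begin{pmatrix} U_i \\ V_i \\ x U_i \\ x V_i \\ W_i \end{pmatrix}.$$

For $\lambda \in \mathbb{R}^5$, denote by

$$\Lambda_0(\lambda) = \ln \mathbb{E} e^{\langle \lambda, f(1) \cdot Z_1 \rangle}, \qquad \Lambda_i(\lambda) = \ln \mathbb{E} e^{\langle \lambda, f(\kappa_i) \cdot Z_1 \rangle}, \quad i \in \{1, 2\}.$$

Consider also the associated domains:

$$\mathcal{D}_0 = \{\lambda \in \mathbb{R}^5; \ \Lambda_0(\lambda) < \infty\}, \qquad \mathcal{D}_i = \{\lambda \in \mathbb{R}^5; \ \Lambda_i(\lambda) < \infty\}, \quad i \in \{1, 2\}.$$

Remark that

$$\lambda = (\alpha, \beta, \gamma, \delta, \theta) \in \mathcal{D}_i \iff \lambda_i = (\alpha, \beta, \kappa_i \gamma, \kappa_i \delta, \theta) \in \mathcal{D}_0, \quad i \in \{1, 2\}. \tag{6.1}$$

From Theorem 3.2 we know that

$$I_Z(z) = \sup_{\lambda \in \mathcal{D}_0 \cap \mathcal{D}_1 \cap \mathcal{D}_2} \{\langle \lambda, z \rangle - \Lambda_0(\lambda)\} \quad \text{and} \quad \tilde{I}_Z(z) = \sup_{\lambda \in \mathcal{D}_0 \cap \mathcal{D}_1} \{\langle \lambda, z \rangle - \Lambda_0(\lambda)\}.$$

We now prove that $\lambda \in \mathcal{D}_0 \cap \mathcal{D}_1$ implies that $\lambda \in \mathcal{D}_2$. Let $\lambda = (\alpha, \beta, \gamma, \delta, \theta) \in \mathcal{D}_0 \cap \mathcal{D}_1$. From (6.1),

$$\lambda \in \mathcal{D}_1 \implies \lambda_1 = (\alpha, \beta, \kappa_1 \gamma, \kappa_1 \delta, \theta) \in \mathcal{D}_0.$$

Moreover, as $1 < \kappa_2 < \kappa_1$, $\kappa_2$ can be written as $\kappa_2 = a + b\kappa_1$, with $a, b$ non-negative and $a + b = 1$. Due to the convexity of $\mathcal{D}_0$, we have that $a\lambda + b\lambda_1 \in \mathcal{D}_0$. On the other hand,

$$a\lambda + b\lambda_1 = (\alpha, \beta, \kappa_2 \gamma, \kappa_2 \delta, \theta),$$

so that $\lambda \in \mathcal{D}_2$ by (6.1). Therefore,

$$I_Z(z) = \sup_{\lambda \in \mathcal{D}_0 \cap \mathcal{D}_1 \cap \mathcal{D}_2} \{\langle \lambda, z \rangle - \Lambda_0(\lambda)\} = \sup_{\lambda \in \mathcal{D}_0 \cap \mathcal{D}_1} \{\langle \lambda, z \rangle - \Lambda_0(\lambda)\} = \tilde{I}_Z(z)$$

and the proof of Proposition 6.1 is completed.

□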
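The convex-combination step of this proof can be checked on explicit numbers. The following sketch uses exact rational arithmetic; the helper names and the numerical values of $\kappa_1$, $\kappa_2$ and $\lambda$ are illustrative assumptions, not taken from the paper. It computes the weights $a, b$ with $\kappa_2 = a + b\kappa_1$ and verifies that $a\lambda + b\lambda_1$ is the vector $(\alpha, \beta, \kappa_2\gamma, \kappa_2\delta, \theta)$ appearing in (6.1).

```python
from fractions import Fraction


def twist(lam, kappa):
    """Map (alpha, beta, gamma, delta, theta) to (alpha, beta, kappa*gamma, kappa*delta, theta)."""
    alpha, beta, gamma, delta, theta = lam
    return (alpha, beta, kappa * gamma, kappa * delta, theta)


def convex_weights(kappa1, kappa2):
    """Return (a, b) with a, b >= 0, a + b = 1 and kappa2 = a * 1 + b * kappa1."""
    if not 1 < kappa2 < kappa1:
        raise ValueError("expected 1 < kappa2 < kappa1")
    b = (kappa2 - 1) / (kappa1 - 1)
    return 1 - b, b


def combine(a, lam, b, mu):
    """Componentwise convex combination a * lam + b * mu."""
    return tuple(a * x + b * y for x, y in zip(lam, mu))
```

For example, with $\kappa_1 = 5$ and $\kappa_2 = 2$ the weights are $a = 3/4$ and $b = 1/4$.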

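6.3. The non-convex model. Let $(X_i)_{i \ge 1}$ and $(Y_i)_{i \ge 1}$ be two independent families of i.i.d. standard Gaussian random variables and consider the i.i.d. $\mathbb{R}^3$-valued random variables

$$Z_i = \begin{pmatrix} X_i^2 \\ Y_i^2 \\ X_i Y_i \end{pmatrix}.$$

We shall study the LDP of

$$L_n(Z) = \frac{1}{n} \sum_{i=1}^n f(x_i^n) \cdot Z_i = \frac{1}{n} \begin{pmatrix} X_1^2 \\ Y_1^2 \\ \kappa_1 X_1^2 \\ \kappa_1 Y_1^2 \\ X_1 Y_1 \end{pmatrix} + \frac{1}{n} \begin{pmatrix} X_2^2 \\ Y_2^2 \\ \kappa_2 X_2^2 \\ \kappa_2 Y_2^2 \\ X_2 Y_2 \end{pmatrix} + \frac{1}{n} \sum_{i=3}^n \begin{pmatrix} X_i^2 \\ Y_i^2 \\ X_i^2 \\ Y_i^2 \\ X_i Y_i \end{pmatrix} = \pi_n^1(Z) + \pi_n^2(Z) + \hat{L}_n(Z).$$

As above, we also introduce $\tilde{L}_n(Z) = \pi_n^1(Z) + \hat{L}_n(Z)$.

The non-convex model satisfies the assumptions of Theorem 5.1. Therefore, both $L_n(Z)$ and $\tilde{L}_n(Z)$ satisfy the LDP with given rate functions that we denote respectively by $I_Z$ and $\tilde{I}_Z$.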

We shall prove the following:

Proposition 6.2. Let $\kappa_1 < 2\kappa_2 - 1$. The rate function $I_Z$ that drives the LDP for $L_n(Z)$ differs from the rate function $\tilde{I}_Z$ that drives the LDP for $\tilde{L}_n(Z)$.

Remark 6.2. Proposition 6.2 illustrates the influence of the second largest eigenvalue on the rate function of the LDP in the non-convex case. Note that the condition $\kappa_1 < 2\kappa_2 - 1$ is merely technical and leads to easier computations.

Proof. In order to prove Proposition 6.2, we shall prove that there exists some point $z^*$ such that

$$I_Z(z^*) < \infty \quad \text{while} \quad \tilde{I}_Z(z^*) = \infty.$$

Denote by $z = (x, y, x', y', r)$ and by $A$ the convex set

$$A = \{z \in \mathbb{R}^5; \ x > 0, \ y > 0, \ x' = x, \ y' = y, \ r^2 < xy\}.$$

Then Cramér's theorem yields the LDP for $\hat{L}_n(Z)$ with good rate function

$$\Gamma^*(z) = \frac{x + y - 2}{2} - \frac{1}{2} \log(xy - r^2) + \Delta(z \mid A).$$

Denote by $B_\kappa$ the following non-convex set:

$$B_\kappa = \{z \in \mathbb{R}^5; \ x \ge 0, \ y \ge 0, \ x' = \kappa x, \ y' = \kappa y, \ |r| = \sqrt{xy}\}.$$

One can prove that $\pi_n^1(Z)$ and $\pi_n^2(Z)$ satisfy the LDP with respective rate functions

$$I_1(z) = \frac{x + y}{2} + \Delta(z \mid B_{\kappa_1}) \quad \text{and} \quad I_2(z) = \frac{x + y}{2} + \Delta(z \mid B_{\kappa_2}).$$

The contraction principle then yields

$$I_Z(z) = \inf_{z_0 + z_1 + z_2 = z} \{\Gamma^*(z_0) + I_1(z_1) + I_2(z_2)\}, \qquad \tilde{I}_Z(z) = \inf_{z_0 + z_1 = z} \{\Gamma^*(z_0) + I_1(z_1)\}.$$

Let $z^* = (1, 1, \kappa_2, \kappa_2, 0)$; then we shall prove that

$$I_Z(z^*) < \infty \quad \text{while} \quad \tilde{I}_Z(z^*) = \infty. \tag{6.2}$$

This will complete the proof of Proposition 6.2.

In the sequel, we use the notation $z_i = (x_i, y_i, x_i', y_i', r_i)$ with $i \in \{0, 1, 2\}$. From the definition of $\tilde{I}_Z$, one can easily check that $\tilde{I}_Z(z^*)$ is finite iff the following system:

$$\begin{cases} x_0 + x_1 = 1 \\ y_0 + y_1 = 1 \\ x_0 + \kappa_1 x_1 = \kappa_2 \\ y_0 + \kappa_1 y_1 = \kappa_2 \\ x_1 y_1 < x_0 y_0 \end{cases} \tag{6.3}$$

has a solution such that $x_0 > 0$, $y_0 > 0$, $x_1 \ge 0$ and $y_1 \ge 0$. From easy computations, such a solution should satisfy

$$x_0 = \frac{\kappa_1 - \kappa_2}{\kappa_1 - 1} = y_0. \tag{6.4}$$

On the other hand, the last inequality of (6.3) implies that $(1 - x_0)^2 < x_0^2$, that is $x_0 > \frac{1}{2}$. As we have assumed that $\kappa_1 < 2\kappa_2 - 1$, this is not compatible with (6.4) and

$$\tilde{I}_Z(z^*) = \infty.$$

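We now prove that $I_Z(z^*) < \infty$. The mere definition of $I_Z$ yields that $I_Z(z^*) < \infty$ iff there exists a solution to the following system

$$\begin{cases} x_0 + x_1 + x_2 = 1 \\ y_0 + y_1 + y_2 = 1 \\ x_0 + \kappa_1 x_1 + \kappa_2 x_2 = \kappa_2 \\ y_0 + \kappa_1 y_1 + \kappa_2 y_2 = \kappa_2 \\ r_0 + \epsilon_1 \sqrt{x_1 y_1} + \epsilon_2 \sqrt{x_2 y_2} = 0 \end{cases} \tag{6.5}$$

satisfying $x_0 > 0$, $y_0 > 0$, $x_1 \ge 0$, $y_1 \ge 0$, $x_2 \ge 0$, $y_2 \ge 0$, $\epsilon_{1,2} = \pm 1$ and $r_0^2 < x_0 y_0$. We can easily check that this system admits the following solution:

$$x_0 = y_0 = \frac{\kappa_1 - \kappa_2}{\kappa_1 + \kappa_2 - 2}, \qquad x_1 = y_1 = x_2 = y_2 = \frac{\kappa_2 - 1}{\kappa_1 + \kappa_2 - 2}, \qquad \epsilon_1 = -\epsilon_2 = -1 \quad \text{and} \quad r_0 = 0.$$

Therefore, (6.2) is proved.

□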
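Both feasibility arguments of this proof reduce to elementary arithmetic, which can be checked exactly. The sketch below uses rational arithmetic and illustrative function names; the sample values of $\kappa_1$ and $\kappa_2$ are assumptions chosen for the example. The first pair of functions solves the linear part of (6.3) and tests its last inequality; the second pair evaluates the candidate solution of (6.5) with $x_i = y_i$, $\epsilon_1 = -1$, $\epsilon_2 = 1$ and $r_0 = 0$.

```python
from fractions import Fraction


def two_term_solution(k1, k2):
    """Unique solution (x0, x1) of x0 + x1 = 1, x0 + k1 * x1 = k2 (same for y0, y1)."""
    x1 = (k2 - 1) / (k1 - 1)
    return 1 - x1, x1


def two_term_feasible(k1, k2):
    """True iff (6.3) has an admissible solution; here y0 = x0 and y1 = x1."""
    x0, x1 = two_term_solution(k1, k2)
    return x0 > 0 and x1 >= 0 and x1 * x1 < x0 * x0


def three_term_solution(k1, k2):
    """Candidate solution of (6.5) with x_i = y_i: returns (x0, x1, x2)."""
    s = k1 + k2 - 2
    return (k1 - k2) / s, (k2 - 1) / s, (k2 - 1) / s


def three_term_feasible(k1, k2):
    """Check every line of (6.5) with eps1 = -1, eps2 = +1 and r0 = 0."""
    x0, x1, x2 = three_term_solution(k1, k2)
    eps1, eps2, r0 = -1, 1, 0
    # With y_i = x_i and x_i >= 0, sqrt(x_i * y_i) is simply x_i.
    return (x0 + x1 + x2 == 1
            and x0 + k1 * x1 + k2 * x2 == k2
            and r0 + eps1 * x1 + eps2 * x2 == 0
            and x0 > 0 and x1 >= 0 and x2 >= 0
            and r0 * r0 < x0 * x0)
```

With $\kappa_1 = 5/2$ and $\kappa_2 = 2$ (so that $\kappa_1 < 2\kappa_2 - 1$), the two-term system is infeasible while the three-term one is feasible; with $\kappa_1 = 5$ and $\kappa_2 = 2$ both are feasible, which is consistent with the condition of Proposition 6.2 being used only for the two-term system.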



6.4. Links with the spherical integral beyond the rank-one case. When one wants to study the asymptotics of the spherical integral in the case when the matrix $A_n$ in (4.6) is of finite rank larger than one, one is led to study the Large Deviations for empirical means which do not fulfill the convexity assumption (Assumption (A-2)). For example, in the rank two case, the related empirical mean to look at is given by:

$$L_n^{(2)} = \frac{1}{n} \sum_{i=1}^n f^{(2)}(x_i^n) \cdot Z_i \quad \text{with} \quad Z_i = \ldots \quad \text{and} \quad f^{(2)}(x) = \ldots$$

and Theorem 5.1 applies whenever (A-6) is fulfilled. It is then an easy application of Varadhan's Lemma to get the convergence of the spherical integrals in the rank two case (and analogously for an arbitrary finite rank). The example studied in Section 6.3 supports the feeling (although in a very indirect way) that the asymptotics of the spherical integral in this case should depend not only on the largest eigenvalue (as proved in the rank-one case in [9]) but also on the second largest eigenvalue and maybe on other ones, the number of which is related to the rank of $A_n$. Unfortunately, the very intricate formula of the rate function associated to the LDP in the non-convex case gives little clue on how to relate the asymptotics of the spherical integral to the largest eigenvalues beyond the rank-one case.

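Appendix A. Proof of Lemma 2.4

Proof. Let $\varepsilon > 0$ be fixed. Note that $\frac{\operatorname{card}(\mathcal{C}_n)}{n} \to 0$ by Assumption (A-3). Since $\mathcal{C}_\infty$ exists by (2.4) and is compact by (A-3), there exists a finite number of $m \times d$ matrices $(a_1, \dots, a_p)$ such that

$$\mathcal{C}_\infty \subset \bigcup_{k=1}^p B(a_k, \varepsilon) \quad \text{where} \quad B(a_k, \varepsilon) = \{y \in \mathbb{R}^{m \times d}, \ |y - a_k| < \varepsilon\}.$$

From the cover $(B(a_k, \varepsilon), \ 1 \le k \le p)$, one can easily build a partition $(\Gamma_k, \ 1 \le k \le p')$ where $p' \le p$ with the following properties:

- $\mathcal{C}_\infty \subset \bigcup_{k=1}^{p'} \Gamma_k$,

- $\sup\{|x - x'|, \ (x, x') \in \Gamma_k^2\} \le 2\varepsilon$,

- $\operatorname{int}(\Gamma_k) \cap \mathcal{C}_\infty \neq \emptyset$ for $1 \le k \le p'$ (in particular $\operatorname{int}(\Gamma_k) \neq \emptyset$).

Let $b_{k,\varepsilon}$ be an element of $\operatorname{int}(\Gamma_k) \cap \mathcal{C}_\infty$. Denote by

$$f^\varepsilon(x) = \sum_{k=1}^{p'} b_{k,\varepsilon} \mathbf{1}_{\Gamma_k}(f(x)), \ x \in \mathcal{X} \quad \text{and} \quad \mathcal{D}_\varepsilon = \bigcap_{k=1}^{p'} \mathcal{D}_{b_{k,\varepsilon}}.$$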

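We will prove in the sequel the following facts:

(1) The partial weighted empirical mean $\pi_n^\varepsilon$ defined by

$$\pi_n^\varepsilon = \frac{1}{n} \sum_{x_i^n \in \mathcal{C}_n} f^\varepsilon(x_i^n) \cdot Z_i$$

satisfies the LDP with good rate function $\Delta^*(z \mid \mathcal{D}_\varepsilon) = \sup\{\langle z, \lambda \rangle, \ \lambda \in \mathcal{D}_\varepsilon\}$.

(2) The family of random variables $(\pi_n^\varepsilon, \ \varepsilon > 0)$ is an exponential approximation of $(\pi_n)$, i.e.

$$\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{1}{n} \log P\{|\pi_n^\varepsilon - \pi_n| > \delta\} = -\infty, \quad \forall \delta > 0.$$

(3) Finally, the family $(\pi_n, \ n \ge 1)$ satisfies the LDP with good rate function $\Delta^*(z \mid \mathcal{D})$.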

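Let us first prove fact (1).

$$\pi_n^\varepsilon = \frac{1}{n} \sum_{x_i^n \in \mathcal{C}_n} f^\varepsilon(x_i^n) \cdot Z_i = \sum_{k=1}^{p'} b_{k,\varepsilon} \cdot \frac{1}{n} \sum_{x_i^n \in f^{-1}(\Gamma_k)} Z_i.$$

Since the sets $(\Gamma_k)$ are disjoint, the partial empirical means $\frac{1}{n} \sum_{x_i^n \in f^{-1}(\Gamma_k)} Z_i$ are independent. Denote by $\phi_k(n)$ the cardinality of the set $\{x_i^n, \ f(x_i^n) \in \Gamma_k\}$. One has to check that

$$\lim_{n \to \infty} \frac{\phi_k(n)}{n} = 0 \quad \text{and} \quad \phi_k(n) \ge 1 \ \text{for } n \text{ large enough}.$$

Since $\phi_k(n) \le \operatorname{card}(\mathcal{C}_n)$, the first point is proved. Recall now that $\operatorname{int}(\Gamma_k) \cap \mathcal{C}_\infty \neq \emptyset$. Thus Condition (2.4) yields that for $n$ large enough, there always exist points of $\mathcal{C}_n$ that belong to $\Gamma_k$. In particular, $\phi_k(n) \ge 1$ eventually. Therefore, Lemma 2.3 yields the LDP for $\frac{1}{n} \sum_{x_i^n \in f^{-1}(\Gamma_k)} Z_i$ with good rate function $I(y)$.

A straightforward application of the contraction principle ([7, Theorem 4.2.1]) yields the LDP for $\pi_n^\varepsilon$ with good rate function

$$\Delta_\varepsilon^*(z) = \inf \left\{ \sum_{k=1}^{p'} I(y_k), \quad \sum_{k=1}^{p'} b_{k,\varepsilon} \cdot y_k = z \right\}.$$

We prefer the following representation which expresses the rate function $\Delta_\varepsilon^*$ as an inf-convolution:

$$\Delta_\varepsilon^*(z) = \inf \left\{ \sum_{k=1}^{p'} \Delta^*(z_k \mid \mathcal{D}_{b_{k,\varepsilon}}), \quad \sum_{k=1}^{p'} z_k = z \right\}. \tag{A.1}$$

The rate function $\Delta_\varepsilon^*$ is lower semi-continuous, therefore [13, Theorem 16.4] yields:

$$\Delta_\varepsilon^*(z) = \sup_\lambda \left\{ \langle \lambda, z \rangle - \sum_{k=1}^{p'} \Delta(\lambda \mid \bar{\mathcal{D}}_{b_{k,\varepsilon}}) \right\} = \sup_\lambda \left\{ \langle \lambda, z \rangle - \Delta\Big(\lambda \ \Big| \ \bigcap_{1 \le k \le p'} \bar{\mathcal{D}}_{b_{k,\varepsilon}}\Big) \right\}$$

$$= \Delta^*\Big(z \ \Big| \ \bigcap_{1 \le k \le p'} \bar{\mathcal{D}}_{b_{k,\varepsilon}}\Big) \overset{(a)}{=} \Delta^*\Big(z \ \Big| \ \bigcap_{1 \le k \le p'} \mathcal{D}_{b_{k,\varepsilon}}\Big) = \Delta^*(z \mid \mathcal{D}_\varepsilon),$$

where (a) follows from Proposition 2.1. Fact (1) is proved.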
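Let us now prove fact (2). We have

$$|\pi_n^\varepsilon - \pi_n| \le \frac{1}{n} \sum_{x_i^n \in \mathcal{C}_n} |f^\varepsilon(x_i^n) - f(x_i^n)| \, |Z_i|.$$

By the definition of $f^\varepsilon$, if $f(x_i^n) \in \Gamma_k$ then $f^\varepsilon(x_i^n) = b_{k,\varepsilon}$ and $|f(x_i^n) - b_{k,\varepsilon}| \le 2\varepsilon$. Therefore $|\pi_n^\varepsilon - \pi_n| \le \frac{2\varepsilon}{n} \sum_{x_i^n \in \mathcal{C}_n} |Z_i|$ and

$$P\{|\pi_n^\varepsilon - \pi_n| > \delta\} \le \exp\left(-\frac{n\delta\kappa}{2\varepsilon}\right) \left(\mathbb{E} e^{\kappa |Z_1|}\right)^{\operatorname{card}(\mathcal{C}_n)},$$

where $\kappa > 0$ is such that $\mathbb{E} e^{\kappa |Z_1|} < \infty$. Therefore

$$\limsup_{n \to \infty} \frac{1}{n} \log P\{|\pi_n^\varepsilon - \pi_n| > \delta\} \le -\frac{\kappa\delta}{2\varepsilon} \xrightarrow[\varepsilon \to 0]{} -\infty,$$

which proves the exponential equivalence. Fact (2) is proved.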

We now prove fact (3). Since $(\pi_n^\varepsilon, \ \varepsilon > 0)$ is an exponential approximation of $\pi_n$, Theorem 4.2.16 (a) in [7] implies that $\pi_n$ satisfies a weak LDP with rate function given by:

$$T(z) = \sup_{\delta > 0} \liminf_{\varepsilon \to 0} \inf_{z' \in B(z,\delta)} \Delta_\varepsilon^*(z') \overset{(*)}{=} \sup_{\delta > 0} \limsup_{\varepsilon \to 0} \inf_{z' \in B(z,\delta)} \Delta_\varepsilon^*(z'),$$

where (*) is a by-product of the proof of [7, Theorem 4.2.16] (see Eq. (4.2.19) for instance). This precisely means that $T$ is the epigraphical limit of $\Delta_\varepsilon^*$ (see [14, Chapter 7] for details). In order to prove that $T = \Delta^*(\cdot \mid \mathcal{D})$, we first note that

$$\lim_{\varepsilon \to 0} \mathcal{D}_\varepsilon = \bar{\mathcal{D}}.$$

A corollary [14, Corollary 11.35(a)] of Wijsman's theorem [14, Theorem 11.34] immediately yields:

$$T(z) = \Delta^*(z \mid \bar{\mathcal{D}}) = \operatorname*{epi-lim}_{\varepsilon \to 0} \Delta^*(z \mid \mathcal{D}_\varepsilon), \tag{A.2}$$

where epi-lim denotes the epigraphical limit. Since $\Delta^*(z \mid \bar{\mathcal{D}}) = \Delta^*(z \mid \mathcal{D})$ by Proposition 2.1, we have $T = \Delta^*(\cdot \mid \mathcal{D})$. Fact (3) is thus proved and so is Lemma 2.4.

□